Overview

Brought to you by YData

Dataset statistics

Number of variables: 81
Number of observations: 478707
Missing cells: 18866608
Missing cells (%): 48.7%
Total size in memory: 295.8 MiB
Average record size in memory: 648.0 B
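
The missing-cell percentage follows from the other figures in this table, assuming a cell is one variable value for one observation (so the table holds 81 × 478707 cells). A minimal Python check of that arithmetic, with variable names chosen here for illustration:

```python
# Figures copied from the Dataset statistics table above.
n_variables = 81
n_observations = 478_707
missing_cells = 18_866_608

# Assumption: every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```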

Variable types

Text: 81

Dataset

Description: Field Museum of Natural History (Zoology) Insect, Arachnid and Myriapod Collection 0000214-250121130708018
URL: https://doi.org/10.15468/0ywfpc

Alerts

accessRights has constant value "https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images" Constant
license has constant value "https://creativecommons.org/publicdomain/zero/1.0/" Constant
rightsHolder has constant value "The Field Museum of Natural History" Constant
datasetID has constant value "insects-14-jan-2025" Constant
institutionCode has constant value "FMNH" Constant
collectionCode has constant value "Insects" Constant
ownerInstitutionCode has constant value "FMNH" Constant
basisOfRecord has constant value "PreservedSpecimen" Constant
samplingProtocol has constant value "South America" Constant
fieldNotes has constant value "Bolivia" Constant
higherGeographyID has constant value "Ichilo" Constant
minimumElevationInMeters has constant value "67" Constant
verbatimElevation has constant value "-17.816667" Constant
verticalDatum has constant value "-64.216667" Constant
pointRadiusSpatialFit has constant value "Patrick Belenky : Field Museum of Natural History - Department of Zoology" Constant
verbatimCoordinates has constant value "2013" Constant
verbatimLatitude has constant value "Latlong.net" Constant
identificationQualifier has constant value "cf." Constant
identificationRemarks has constant value "Hexacylloepus Hinton, 1940" Constant
namePublishedInID has constant value "Animalia Arthropoda Insecta Coleoptera Elmidae" Constant
taxonConceptID has constant value "Animalia" Constant
acceptedNameUsage has constant value "Insecta" Constant
parentNameUsage has constant value "Coleoptera" Constant
nameAccordingTo has constant value "Elmidae" Constant
phylum has constant value "Arthropoda" Constant
nomenclaturalCode has constant value "ICZN" Constant
recordNumber has 393011 (82.1%) missing values Missing
recordedBy has 79855 (16.7%) missing values Missing
individualCount has 27356 (5.7%) missing values Missing
sex has 26817 (5.6%) missing values Missing
lifeStage has 26817 (5.6%) missing values Missing
fieldNumber has 359459 (75.1%) missing values Missing
eventDate has 65321 (13.6%) missing values Missing
eventTime has 476277 (99.5%) missing values Missing
startDayOfYear has 55118 (11.5%) missing values Missing
endDayOfYear has 55117 (11.5%) missing values Missing
year has 69555 (14.5%) missing values Missing
month has 69683 (14.6%) missing values Missing
day has 86551 (18.1%) missing values Missing
habitat has 395207 (82.6%) missing values Missing
samplingProtocol has 478706 (> 99.9%) missing values Missing
fieldNotes has 478706 (> 99.9%) missing values Missing
locationID has 35666 (7.5%) missing values Missing
higherGeographyID has 478706 (> 99.9%) missing values Missing
higherGeography has 315671 (65.9%) missing values Missing
continent has 25889 (5.4%) missing values Missing
islandGroup has 464544 (97.0%) missing values Missing
island has 462721 (96.7%) missing values Missing
country has 30269 (6.3%) missing values Missing
stateProvince has 63499 (13.3%) missing values Missing
county has 201710 (42.1%) missing values Missing
locality has 53551 (11.2%) missing values Missing
minimumElevationInMeters has 478706 (> 99.9%) missing values Missing
maximumElevationInMeters has 478702 (> 99.9%) missing values Missing
verbatimElevation has 478706 (> 99.9%) missing values Missing
verticalDatum has 478706 (> 99.9%) missing values Missing
minimumDepthInMeters has 477976 (99.8%) missing values Missing
locationRemarks has 387879 (81.0%) missing values Missing
decimalLatitude has 127471 (26.6%) missing values Missing
decimalLongitude has 127471 (26.6%) missing values Missing
geodeticDatum has 477980 (99.8%) missing values Missing
coordinateUncertaintyInMeters has 477312 (99.7%) missing values Missing
pointRadiusSpatialFit has 478706 (> 99.9%) missing values Missing
verbatimCoordinates has 478706 (> 99.9%) missing values Missing
verbatimLatitude has 478706 (> 99.9%) missing values Missing
georeferencedBy has 176818 (36.9%) missing values Missing
georeferencedDate has 180845 (37.8%) missing values Missing
georeferenceProtocol has 163939 (34.2%) missing values Missing
georeferenceSources has 473327 (98.9%) missing values Missing
georeferenceRemarks has 443963 (92.7%) missing values Missing
identificationQualifier has 478083 (99.9%) missing values Missing
typeStatus has 477602 (99.8%) missing values Missing
identifiedBy has 243207 (50.8%) missing values Missing
dateIdentified has 342076 (71.5%) missing values Missing
identificationRemarks has 478706 (> 99.9%) missing values Missing
namePublishedInID has 478706 (> 99.9%) missing values Missing
taxonConceptID has 478706 (> 99.9%) missing values Missing
scientificName has 23187 (4.8%) missing values Missing
acceptedNameUsage has 478706 (> 99.9%) missing values Missing
parentNameUsage has 478706 (> 99.9%) missing values Missing
nameAccordingTo has 478706 (> 99.9%) missing values Missing
higherClassification has 23188 (4.8%) missing values Missing
kingdom has 23187 (4.8%) missing values Missing
phylum has 23191 (4.8%) missing values Missing
class has 23379 (4.9%) missing values Missing
order has 25166 (5.3%) missing values Missing
family has 56649 (11.8%) missing values Missing
genus has 103664 (21.7%) missing values Missing
subgenus has 411166 (85.9%) missing values Missing
specificEpithet has 167744 (35.0%) missing values Missing
infraspecificEpithet has 476240 (99.5%) missing values Missing
taxonRank has 476240 (99.5%) missing values Missing
scientificNameAuthorship has 478701 (> 99.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
organismID has unique values Unique
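
Several variables above carry both a Constant and a Missing alert (samplingProtocol, fieldNotes, higherGeographyID and others). The two are consistent: with 478706 of 478707 values missing, exactly one value is present, so the column trivially has a single distinct value. The sketch below redoes that arithmetic for a few counts copied from the alerts; the helper name `missing_pct` is an illustrative choice, not part of ydata-profiling.

```python
N_OBSERVATIONS = 478_707  # from the Dataset statistics table


def missing_pct(n_missing: int, n_total: int = N_OBSERVATIONS) -> float:
    """Share of missing values, in percent."""
    return 100 * n_missing / n_total


# Missing counts copied from the alerts above.
alerts = {
    "recordNumber": 393_011,
    "eventTime": 476_277,
    "samplingProtocol": 478_706,
}

for name, n_missing in alerts.items():
    present = N_OBSERVATIONS - n_missing
    print(f"{name}: {missing_pct(n_missing):.1f}% missing, {present} present")
```

Rounded to one decimal, samplingProtocol would print as 100.0%; the report shows "> 99.9%" instead, presumably to signal that the column is not entirely empty.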

Reproduction

Analysis started: 2025-01-23 23:13:38.251858
Analysis finished: 2025-01-23 23:13:54.900968
Duration: 16.65 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
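
The reported duration is the difference between the two timestamps above. A quick check with Python's standard library, using the timestamps copied from this section:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-01-23 23:13:38.251858", FMT)
finished = datetime.strptime("2025-01-23 23:13:54.900968", FMT)

# Elapsed wall-clock time between the two report timestamps.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```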

Variables

gbifID
Text

Unique 

Distinct: 478707
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4787070
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 478707
Unique (%): 100.0%

Sample

1st row: 1291024825
2nd row: 1416884549
3rd row: 1142391473
4th row: 2805038471
5th row: 2422209631
ValueCountFrequency (%)
1291024825 1
 
< 0.1%
1271342429 1
 
< 0.1%
2805038471 1
 
< 0.1%
2422209631 1
 
< 0.1%
1806601120 1
 
< 0.1%
2350117727 1
 
< 0.1%
2466119880 1
 
< 0.1%
1802443244 1
 
< 0.1%
1142383662 1
 
< 0.1%
2564216403 1
 
< 0.1%
Other values (478697) 478697
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 907077
18.9%
2 675389
14.1%
4 591943
12.4%
3 457306
9.6%
0 390324
8.2%
8 382291
8.0%
5 363307
7.6%
6 343669
 
7.2%
7 338622
 
7.1%
9 337142
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4787070
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 907077
18.9%
2 675389
14.1%
4 591943
12.4%
3 457306
9.6%
0 390324
8.2%
8 382291
8.0%
5 363307
7.6%
6 343669
 
7.2%
7 338622
 
7.1%
9 337142
 
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4787070
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 907077
18.9%
2 675389
14.1%
4 591943
12.4%
3 457306
9.6%
0 390324
8.2%
8 382291
8.0%
5 363307
7.6%
6 343669
 
7.2%
7 338622
 
7.1%
9 337142
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4787070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 907077
18.9%
2 675389
14.1%
4 591943
12.4%
3 457306
9.6%
0 390324
8.2%
8 382291
8.0%
5 363307
7.6%
6 343669
 
7.2%
7 338622
 
7.1%
9 337142
 
7.0%

accessRights
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 119
Median length: 119
Mean length: 119
Min length: 119

Characters and Unicode

Total characters: 56966133
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
2nd row: https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
3rd row: https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
4th row: https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
5th row: https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
ValueCountFrequency (%)
https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
- 5744484
 
10.1%
s 5265777
 
9.2%
e 4308363
 
7.6%
t 3829656
 
6.7%
u 3350949
 
5.9%
a 3350949
 
5.9%
n 3350949
 
5.9%
o 3350949
 
5.9%
i 3350949
 
5.9%
d 3350949
 
5.9%
Other values (13) 17712159
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 48349407
84.9%
Dash Punctuation 5744484
 
10.1%
Other Punctuation 2872242
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 5265777
10.9%
e 4308363
 
8.9%
t 3829656
 
7.9%
u 3350949
 
6.9%
a 3350949
 
6.9%
n 3350949
 
6.9%
o 3350949
 
6.9%
i 3350949
 
6.9%
d 3350949
 
6.9%
m 2872242
 
5.9%
Other values (9) 11967675
24.8%
Other Punctuation
ValueCountFrequency (%)
/ 1436121
50.0%
. 957414
33.3%
: 478707
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 5744484
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48349407
84.9%
Common 8616726
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 5265777
10.9%
e 4308363
 
8.9%
t 3829656
 
7.9%
u 3350949
 
6.9%
a 3350949
 
6.9%
n 3350949
 
6.9%
o 3350949
 
6.9%
i 3350949
 
6.9%
d 3350949
 
6.9%
m 2872242
 
5.9%
Other values (9) 11967675
24.8%
Common
ValueCountFrequency (%)
- 5744484
66.7%
/ 1436121
 
16.7%
. 957414
 
11.1%
: 478707
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56966133
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 5744484
 
10.1%
s 5265777
 
9.2%
e 4308363
 
7.6%
t 3829656
 
6.7%
u 3350949
 
5.9%
a 3350949
 
5.9%
n 3350949
 
5.9%
o 3350949
 
5.9%
i 3350949
 
5.9%
d 3350949
 
5.9%
Other values (13) 17712159
31.1%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 50
Median length: 50
Mean length: 50
Min length: 50

Characters and Unicode

Total characters: 23935350
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://creativecommons.org/publicdomain/zero/1.0/
2nd row: https://creativecommons.org/publicdomain/zero/1.0/
3rd row: https://creativecommons.org/publicdomain/zero/1.0/
4th row: https://creativecommons.org/publicdomain/zero/1.0/
5th row: https://creativecommons.org/publicdomain/zero/1.0/
ValueCountFrequency (%)
https://creativecommons.org/publicdomain/zero/1.0 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
/ 2872242
 
12.0%
o 2393535
 
10.0%
i 1436121
 
6.0%
m 1436121
 
6.0%
c 1436121
 
6.0%
r 1436121
 
6.0%
e 1436121
 
6.0%
t 1436121
 
6.0%
. 957414
 
4.0%
n 957414
 
4.0%
Other values (14) 8138019
34.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18669573
78.0%
Other Punctuation 4308363
 
18.0%
Decimal Number 957414
 
4.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2393535
12.8%
i 1436121
 
7.7%
m 1436121
 
7.7%
c 1436121
 
7.7%
r 1436121
 
7.7%
e 1436121
 
7.7%
t 1436121
 
7.7%
n 957414
 
5.1%
a 957414
 
5.1%
s 957414
 
5.1%
Other values (9) 4787070
25.6%
Other Punctuation
ValueCountFrequency (%)
/ 2872242
66.7%
. 957414
 
22.2%
: 478707
 
11.1%
Decimal Number
ValueCountFrequency (%)
1 478707
50.0%
0 478707
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18669573
78.0%
Common 5265777
 
22.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2393535
12.8%
i 1436121
 
7.7%
m 1436121
 
7.7%
c 1436121
 
7.7%
r 1436121
 
7.7%
e 1436121
 
7.7%
t 1436121
 
7.7%
n 957414
 
5.1%
a 957414
 
5.1%
s 957414
 
5.1%
Other values (9) 4787070
25.6%
Common
ValueCountFrequency (%)
/ 2872242
54.5%
. 957414
 
18.2%
1 478707
 
9.1%
: 478707
 
9.1%
0 478707
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23935350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2872242
 
12.0%
o 2393535
 
10.0%
i 1436121
 
6.0%
m 1436121
 
6.0%
c 1436121
 
6.0%
r 1436121
 
6.0%
e 1436121
 
6.0%
t 1436121
 
6.0%
. 957414
 
4.0%
n 957414
 
4.0%
Other values (14) 8138019
34.0%
Distinct: 4563
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Characters and Unicode

Total characters: 10052847
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 843
Unique (%): 0.2%
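
Distinct and Unique measure different things here: Distinct counts how many different values occur, while Unique counts the values that occur exactly once, which is why 4563 distinct timestamps can include only 843 unique ones. A small sketch of the two counts on a made-up sample in the same timestamp format (the sample list is hypothetical, not an excerpt of the dataset):

```python
from collections import Counter

# Hypothetical sample: three different timestamps, one of them repeated.
values = [
    "2023-11-18T17:51-0600",
    "2023-11-18T17:51-0600",
    "2023-11-18T17:41-0600",
    "2024-01-29T10:46-0600",
]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```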

Sample

1st row: 2024-01-29T10:46-0600
2nd row: 2023-11-16T03:35-0600
3rd row: 2024-01-29T11:11-0600
4th row: 2023-11-17T02:42-0600
5th row: 2023-11-15T17:06-0600
ValueCountFrequency (%)
2023-11-18t17:51-0600 3225
 
0.7%
2023-11-18t17:41-0600 3159
 
0.7%
2023-11-18t17:46-0600 3139
 
0.7%
2023-11-18t17:50-0600 2667
 
0.6%
2023-11-18t17:44-0600 2536
 
0.5%
2023-11-18t17:45-0600 2520
 
0.5%
2023-11-18t17:43-0600 2302
 
0.5%
2023-11-18t17:47-0600 2215
 
0.5%
2023-11-18t17:49-0600 2068
 
0.4%
2023-11-30t18:15-0600 1479
 
0.3%
Other values (4553) 453397
94.7%

Most occurring characters

ValueCountFrequency (%)
0 2376060
23.6%
1 1665102
16.6%
- 1436121
14.3%
2 1396352
13.9%
6 653928
 
6.5%
3 507342
 
5.0%
T 478707
 
4.8%
: 478707
 
4.8%
4 293253
 
2.9%
7 210288
 
2.1%
Other values (3) 556987
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7659312
76.2%
Dash Punctuation 1436121
 
14.3%
Uppercase Letter 478707
 
4.8%
Other Punctuation 478707
 
4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2376060
31.0%
1 1665102
21.7%
2 1396352
18.2%
6 653928
 
8.5%
3 507342
 
6.6%
4 293253
 
3.8%
7 210288
 
2.7%
5 203608
 
2.7%
9 198793
 
2.6%
8 154586
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 1436121
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 478707
100.0%
Other Punctuation
ValueCountFrequency (%)
: 478707
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9574140
95.2%
Latin 478707
 
4.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2376060
24.8%
1 1665102
17.4%
- 1436121
15.0%
2 1396352
14.6%
6 653928
 
6.8%
3 507342
 
5.3%
: 478707
 
5.0%
4 293253
 
3.1%
7 210288
 
2.2%
5 203608
 
2.1%
Other values (2) 353379
 
3.7%
Latin
ValueCountFrequency (%)
T 478707
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10052847
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2376060
23.6%
1 1665102
16.6%
- 1436121
14.3%
2 1396352
13.9%
6 653928
 
6.5%
3 507342
 
5.0%
T 478707
 
4.8%
: 478707
 
4.8%
4 293253
 
2.9%
7 210288
 
2.1%
Other values (3) 556987
 
5.5%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 35
Median length: 35
Mean length: 35
Min length: 35

Characters and Unicode

Total characters: 16754745
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: The Field Museum of Natural History
2nd row: The Field Museum of Natural History
3rd row: The Field Museum of Natural History
4th row: The Field Museum of Natural History
5th row: The Field Museum of Natural History
ValueCountFrequency (%)
the 478707
16.7%
field 478707
16.7%
museum 478707
16.7%
of 478707
16.7%
natural 478707
16.7%
history 478707
16.7%

Most occurring characters

ValueCountFrequency (%)
(space) 2393535
14.3%
e 1436121
 
8.6%
u 1436121
 
8.6%
s 957414
 
5.7%
i 957414
 
5.7%
l 957414
 
5.7%
r 957414
 
5.7%
t 957414
 
5.7%
a 957414
 
5.7%
o 957414
 
5.7%
Other values (10) 4787070
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11967675
71.4%
Space Separator 2393535
 
14.3%
Uppercase Letter 2393535
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1436121
12.0%
u 1436121
12.0%
s 957414
8.0%
i 957414
8.0%
l 957414
8.0%
r 957414
8.0%
t 957414
8.0%
a 957414
8.0%
o 957414
8.0%
f 478707
 
4.0%
Other values (4) 1914828
16.0%
Uppercase Letter
ValueCountFrequency (%)
H 478707
20.0%
N 478707
20.0%
T 478707
20.0%
M 478707
20.0%
F 478707
20.0%
Space Separator
ValueCountFrequency (%)
(space) 2393535
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14361210
85.7%
Common 2393535
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1436121
 
10.0%
u 1436121
 
10.0%
s 957414
 
6.7%
i 957414
 
6.7%
l 957414
 
6.7%
r 957414
 
6.7%
t 957414
 
6.7%
a 957414
 
6.7%
o 957414
 
6.7%
f 478707
 
3.3%
Other values (9) 4308363
30.0%
Common
ValueCountFrequency (%)
(space) 2393535
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16754745
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 2393535
14.3%
e 1436121
 
8.6%
u 1436121
 
8.6%
s 957414
 
5.7%
i 957414
 
5.7%
l 957414
 
5.7%
r 957414
 
5.7%
t 957414
 
5.7%
a 957414
 
5.7%
o 957414
 
5.7%
Other values (10) 4787070
28.6%

datasetID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 9095433
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
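
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the constant datasetID value; the mapping from two-letter category codes to the long names used in this report is written out by hand here and covers only the codes this value needs.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this value.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
}

value = "insects-14-jan-2025"
per_value = Counter(CATEGORY_NAMES[unicodedata.category(ch)] for ch in value)

# Every one of the 478707 rows holds the same value, so totals scale linearly.
n_rows = 478_707
totals = {name: count * n_rows for name, count in per_value.items()}
print(totals)
```

The resulting totals (4787070 lowercase letters, 2872242 decimal numbers, 1436121 dashes) agree with the "Most occurring categories" table for datasetID.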

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: insects-14-jan-2025
2nd row: insects-14-jan-2025
3rd row: insects-14-jan-2025
4th row: insects-14-jan-2025
5th row: insects-14-jan-2025
ValueCountFrequency (%)
insects-14-jan-2025 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
- 1436121
15.8%
n 957414
10.5%
s 957414
10.5%
2 957414
10.5%
i 478707
 
5.3%
e 478707
 
5.3%
c 478707
 
5.3%
t 478707
 
5.3%
1 478707
 
5.3%
4 478707
 
5.3%
Other values (4) 1914828
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4787070
52.6%
Decimal Number 2872242
31.6%
Dash Punctuation 1436121
 
15.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 957414
20.0%
s 957414
20.0%
i 478707
10.0%
e 478707
10.0%
c 478707
10.0%
t 478707
10.0%
j 478707
10.0%
a 478707
10.0%
Decimal Number
ValueCountFrequency (%)
2 957414
33.3%
1 478707
16.7%
4 478707
16.7%
0 478707
16.7%
5 478707
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1436121
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4787070
52.6%
Common 4308363
47.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 957414
20.0%
s 957414
20.0%
i 478707
10.0%
e 478707
10.0%
c 478707
10.0%
t 478707
10.0%
j 478707
10.0%
a 478707
10.0%
Common
ValueCountFrequency (%)
- 1436121
33.3%
2 957414
22.2%
1 478707
 
11.1%
4 478707
 
11.1%
0 478707
 
11.1%
5 478707
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9095433
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1436121
15.8%
n 957414
10.5%
s 957414
10.5%
2 957414
10.5%
i 478707
 
5.3%
e 478707
 
5.3%
c 478707
 
5.3%
t 478707
 
5.3%
1 478707
 
5.3%
4 478707
 
5.3%
Other values (4) 1914828
21.1%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1914828
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FMNH
2nd row: FMNH
3rd row: FMNH
4th row: FMNH
5th row: FMNH
ValueCountFrequency (%)
fmnh 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1914828
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1914828
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1914828
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 3350949
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Insects
2nd row: Insects
3rd row: Insects
4th row: Insects
5th row: Insects
ValueCountFrequency (%)
insects 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
s 957414
28.6%
I 478707
14.3%
n 478707
14.3%
e 478707
14.3%
c 478707
14.3%
t 478707
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2872242
85.7%
Uppercase Letter 478707
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 957414
33.3%
n 478707
16.7%
e 478707
16.7%
c 478707
16.7%
t 478707
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 478707
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3350949
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 957414
28.6%
I 478707
14.3%
n 478707
14.3%
e 478707
14.3%
c 478707
14.3%
t 478707
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3350949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 957414
28.6%
I 478707
14.3%
n 478707
14.3%
e 478707
14.3%
c 478707
14.3%
t 478707
14.3%

ownerInstitutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1914828
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FMNH
2nd row: FMNH
3rd row: FMNH
4th row: FMNH
5th row: FMNH
ValueCountFrequency (%)
fmnh 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1914828
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1914828
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1914828
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 478707
25.0%
M 478707
25.0%
N 478707
25.0%
H 478707
25.0%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Characters and Unicode

Total characters: 8138019
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen
ValueCountFrequency (%)
preservedspecimen 478707
100.0%

Most occurring characters

ValueCountFrequency (%)
e 2393535
29.4%
r 957414
 
11.8%
P 478707
 
5.9%
s 478707
 
5.9%
v 478707
 
5.9%
d 478707
 
5.9%
S 478707
 
5.9%
p 478707
 
5.9%
c 478707
 
5.9%
i 478707
 
5.9%
Other values (2) 957414
 
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7180605
88.2%
Uppercase Letter 957414
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2393535
33.3%
r 957414
 
13.3%
s 478707
 
6.7%
v 478707
 
6.7%
d 478707
 
6.7%
p 478707
 
6.7%
c 478707
 
6.7%
i 478707
 
6.7%
m 478707
 
6.7%
n 478707
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
P 478707
50.0%
S 478707
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8138019
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2393535
29.4%
r 957414
 
11.8%
P 478707
 
5.9%
s 478707
 
5.9%
v 478707
 
5.9%
d 478707
 
5.9%
S 478707
 
5.9%
p 478707
 
5.9%
c 478707
 
5.9%
i 478707
 
5.9%
Other values (2) 957414
 
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8138019
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2393535
29.4%
r 957414
 
11.8%
P 478707
 
5.9%
s 478707
 
5.9%
v 478707
 
5.9%
d 478707
 
5.9%
S 478707
 
5.9%
p 478707
 
5.9%
c 478707
 
5.9%
i 478707
 
5.9%
Other values (2) 957414
 
11.8%

occurrenceID
Text

Unique 

Distinct: 478707
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 17233452
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 478707
Unique (%): 100.0%

Sample

1st row: 61faec85-d4ac-4fc2-9a4a-7d444ff68715
2nd row: 61fb6d1e-f7fb-4ce0-a863-0c0be49529a5
3rd row: 61fb9f75-477a-4dc1-bebf-f0d03d52c3b7
4th row: 61fc0619-0f84-42c0-950b-2666d77434b5
5th row: 61fca648-3723-4036-9a2b-535925777457
ValueCountFrequency (%)
61faec85-d4ac-4fc2-9a4a-7d444ff68715 1
 
< 0.1%
620c58c8-89d8-4511-95b1-d9aeb2f07296 1
 
< 0.1%
61fc0619-0f84-42c0-950b-2666d77434b5 1
 
< 0.1%
61fca648-3723-4036-9a2b-535925777457 1
 
< 0.1%
61fe3862-c5be-4be1-810d-d3e360bb9e14 1
 
< 0.1%
62038ac8-58c5-4223-a59c-eed454321f72 1
 
< 0.1%
6203d969-52f2-4e85-b6e7-0de67071936e 1
 
< 0.1%
6204cee6-ec24-484e-85e2-831580215e21 1
 
< 0.1%
620500e7-47ff-4146-bd0e-e9b232947e34 1
 
< 0.1%
620566b5-53b3-448f-97a1-59db4ac175ed 1
 
< 0.1%
Other values (478697) 478697
> 99.9%

Most occurring characters

ValueCountFrequency (%)
- 1914828
 
11.1%
4 1375728
 
8.0%
8 1018529
 
5.9%
a 1017699
 
5.9%
9 1017199
 
5.9%
b 1016975
 
5.9%
2 898556
 
5.2%
3 898448
 
5.2%
5 898271
 
5.2%
d 897957
 
5.2%
Other values (7) 6279262
36.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9695742
56.3%
Lowercase Letter 5622882
32.6%
Dash Punctuation 1914828
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1375728
14.2%
8 1018529
10.5%
9 1017199
10.5%
2 898556
9.3%
3 898448
9.3%
5 898271
9.3%
0 897908
9.3%
6 897350
9.3%
7 897099
9.3%
1 896654
9.2%
Lowercase Letter
ValueCountFrequency (%)
a 1017699
18.1%
b 1016975
18.1%
d 897957
16.0%
c 897248
16.0%
f 897157
16.0%
e 895846
15.9%
Dash Punctuation
ValueCountFrequency (%)
- 1914828
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11610570
67.4%
Latin 5622882
32.6%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1914828
16.5%
4 1375728
11.8%
8 1018529
8.8%
9 1017199
8.8%
2 898556
7.7%
3 898448
7.7%
5 898271
7.7%
0 897908
7.7%
6 897350
7.7%
7 897099
7.7%
Latin
ValueCountFrequency (%)
a 1017699
18.1%
b 1016975
18.1%
d 897957
16.0%
c 897248
16.0%
f 897157
16.0%
e 895846
15.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17233452
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1914828
 
11.1%
4 1375728
 
8.0%
8 1018529
 
5.9%
a 1017699
 
5.9%
9 1017199
 
5.9%
b 1016975
 
5.9%
2 898556
 
5.2%
3 898448
 
5.2%
5 898271
 
5.2%
d 897957
 
5.2%
Other values (7) 6279262
36.4%
Distinct: 478412
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 9574140
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 478122
Unique (%): 99.9%

Sample

1st row: FMNHINS 0003 494 482
2nd row: FMNHINS 0003 561 334
3rd row: FMNHINS 0002 887 559
4th row: FMNHINS 0003 740 024
5th row: FMNHINS 0004 097 745
ValueCountFrequency (%)
fmnhins 478707
25.0%
0003 231056
 
12.1%
0000 124208
 
6.5%
0004 100166
 
5.2%
0002 23267
 
1.2%
821 3457
 
0.2%
820 3362
 
0.2%
828 3112
 
0.2%
371 3079
 
0.2%
819 3029
 
0.2%
Other values (996) 941385
49.2%

Most occurring characters

ValueCountFrequency (%)
0 1893229
19.8%
1436121
15.0%
N 957414
10.0%
3 501414
 
5.2%
F 478707
 
5.0%
H 478707
 
5.0%
I 478707
 
5.0%
S 478707
 
5.0%
M 478707
 
5.0%
4 381889
 
4.0%
Other values (7) 2010538
21.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4787070
50.0%
Uppercase Letter 3350949
35.0%
Space Separator 1436121
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1893229
39.5%
3 501414
 
10.5%
4 381889
 
8.0%
1 341316
 
7.1%
8 307510
 
6.4%
2 303673
 
6.3%
5 278480
 
5.8%
7 270896
 
5.7%
9 258344
 
5.4%
6 250319
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
N 957414
28.6%
F 478707
14.3%
H 478707
14.3%
I 478707
14.3%
S 478707
14.3%
M 478707
14.3%
Space Separator
ValueCountFrequency (%)
1436121
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6223191
65.0%
Latin 3350949
35.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1893229
30.4%
1436121
23.1%
3 501414
 
8.1%
4 381889
 
6.1%
1 341316
 
5.5%
8 307510
 
4.9%
2 303673
 
4.9%
5 278480
 
4.5%
7 270896
 
4.4%
9 258344
 
4.2%
Latin
ValueCountFrequency (%)
N 957414
28.6%
F 478707
14.3%
H 478707
14.3%
I 478707
14.3%
S 478707
14.3%
M 478707
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9574140
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1893229
19.8%
1436121
15.0%
N 957414
10.0%
3 501414
 
5.2%
F 478707
 
5.0%
H 478707
 
5.0%
I 478707
 
5.0%
S 478707
 
5.0%
M 478707
 
5.0%
4 381889
 
4.0%
Other values (7) 2010538
21.0%
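
The category tables above group every character of the variable by its Unicode general category (Decimal Number, Uppercase Letter, Space Separator). As a minimal sketch of how such counts can be reproduced, assuming only Python's standard `unicodedata` module and using the first sample row as input, the hypothetical helper `category_counts` tallies the two-letter category codes (`Nd`, `Lu`, `Zs`). It does not cover scripts or blocks, which the standard library does not expose.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as
            # "Nd" (Decimal Number), "Lu" (Uppercase Letter), "Zs" (Space Separator).
            counts[unicodedata.category(ch)] += 1
    return counts


# First sample row of this variable: 7 letters, 10 digits, 3 spaces.
counts = category_counts(["FMNHINS 0003 494 482"])
```

For this single 20-character value the split is 10 digits, 7 uppercase letters, and 3 spaces, which matches the 50.0% / 35.0% / 15.0% proportions reported for the whole column because every value has the same fixed layout.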

recordNumber
Text

Missing 

Distinct: 21579
Distinct (%): 25.2%
Missing: 393011
Missing (%): 82.1%
Memory size: 3.7 MiB

Length

Max length: 79
Median length: 38
Mean length: 9.915830377
Min length: 1

Characters and Unicode

Total characters: 849747
Distinct characters: 86
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9697
Unique (%): 11.3%

Sample

1st row: , MICH 366 1983
2nd row: #2996
3rd row: P#95-94
4th row: , Atta colombica nest #1
5th row: SuperDB CollEventID: 14226
ValueCountFrequency (%)
14745
 
9.2%
superdb 6638
 
4.1%
colleventid 6638
 
4.1%
can 4257
 
2.6%
mich 2249
 
1.4%
da 2247
 
1.4%
1983 1547
 
1.0%
1978 1490
 
0.9%
1979 1476
 
0.9%
ena 1433
 
0.9%
Other values (15923) 117975
73.4%

Most occurring characters

ValueCountFrequency (%)
1 77222
 
9.1%
75013
 
8.8%
9 46273
 
5.4%
2 45090
 
5.3%
8 36277
 
4.3%
0 34447
 
4.1%
7 33491
 
3.9%
4 31317
 
3.7%
3 30873
 
3.6%
6 28957
 
3.4%
Other values (76) 410787
48.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 389512
45.8%
Uppercase Letter 170868
20.1%
Lowercase Letter 122234
 
14.4%
Space Separator 75013
 
8.8%
Other Punctuation 54364
 
6.4%
Dash Punctuation 18900
 
2.2%
Connector Punctuation 17582
 
2.1%
Close Punctuation 632
 
0.1%
Open Punctuation 632
 
0.1%
Math Symbol 7
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 18817
15.4%
l 17394
14.2%
t 12110
9.9%
r 11710
9.6%
o 10026
8.2%
n 9573
7.8%
p 8822
7.2%
u 8724
7.1%
v 6832
 
5.6%
a 5532
 
4.5%
Other values (17) 12694
10.4%
Uppercase Letter
ValueCountFrequency (%)
C 19193
11.2%
D 18273
10.7%
I 16854
9.9%
B 16274
9.5%
S 15052
8.8%
N 13586
 
8.0%
A 12122
 
7.1%
E 10439
 
6.1%
M 7162
 
4.2%
W 4667
 
2.7%
Other values (16) 37246
21.8%
Other Punctuation
ValueCountFrequency (%)
# 27925
51.4%
, 16100
29.6%
: 6911
 
12.7%
. 2823
 
5.2%
/ 490
 
0.9%
& 64
 
0.1%
; 24
 
< 0.1%
? 16
 
< 0.1%
" 8
 
< 0.1%
' 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 77222
19.8%
9 46273
11.9%
2 45090
11.6%
8 36277
9.3%
0 34447
8.8%
7 33491
8.6%
4 31317
8.0%
3 30873
 
7.9%
6 28957
 
7.4%
5 25565
 
6.6%
Close Punctuation
ValueCountFrequency (%)
) 629
99.5%
} 2
 
0.3%
] 1
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 629
99.5%
[ 3
 
0.5%
Math Symbol
ValueCountFrequency (%)
= 4
57.1%
+ 3
42.9%
Space Separator
ValueCountFrequency (%)
75013
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18900
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 17582
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 556645
65.5%
Latin 293102
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 19193
 
6.5%
e 18817
 
6.4%
D 18273
 
6.2%
l 17394
 
5.9%
I 16854
 
5.8%
B 16274
 
5.6%
S 15052
 
5.1%
N 13586
 
4.6%
A 12122
 
4.1%
t 12110
 
4.1%
Other values (43) 133427
45.5%
Common
ValueCountFrequency (%)
1 77222
13.9%
75013
13.5%
9 46273
 
8.3%
2 45090
 
8.1%
8 36277
 
6.5%
0 34447
 
6.2%
7 33491
 
6.0%
4 31317
 
5.6%
3 30873
 
5.5%
6 28957
 
5.2%
Other values (23) 117685
21.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 849746
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 77222
 
9.1%
75013
 
8.8%
9 46273
 
5.4%
2 45090
 
5.3%
8 36277
 
4.3%
0 34447
 
4.1%
7 33491
 
3.9%
4 31317
 
3.7%
3 30873
 
3.6%
6 28957
 
3.4%
Other values (75) 410786
48.3%
None
ValueCountFrequency (%)
ñ 1
100.0%
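
The summary percentages reported for recordNumber can be checked against the raw counts. The sketch below assumes, based only on the arithmetic working out for this report, that Missing (%) is taken over all 478707 observations while Distinct (%) and Unique (%) are taken over the non-missing values. The function name `profile_percentages` is hypothetical and is not part of any profiling library.

```python
def profile_percentages(total_rows, missing, distinct, unique):
    """Recompute summary percentages from raw counts (one decimal place)."""
    present = total_rows - missing  # rows that actually carry a value
    if present <= 0:
        # Nothing to base distinct/unique shares on for an empty column.
        return {"missing_pct": 100.0, "distinct_pct": 0.0, "unique_pct": 0.0}
    return {
        "missing_pct": round(100 * missing / total_rows, 1),
        # Assumption: distinct and unique shares use non-missing rows as the base.
        "distinct_pct": round(100 * distinct / present, 1),
        "unique_pct": round(100 * unique / present, 1),
    }


# Counts taken from the recordNumber summary above.
record_number = profile_percentages(478707, 393011, 21579, 9697)
```

With the recordNumber counts this yields 82.1, 25.2, and 11.3, the same figures shown in the summary block.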

recordedBy
Text

Missing 

Distinct: 9404
Distinct (%): 2.4%
Missing: 79855
Missing (%): 16.7%
Memory size: 3.7 MiB

Length

Max length: 90
Median length: 87
Mean length: 14.24649745
Min length: 2

Characters and Unicode

Total characters: 5682244
Distinct characters: 93
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3568
Unique (%): 0.9%

Sample

1st row: H. G. Nelson
2nd row: F. N. Young
3rd row: S. B. Peck
4th row: A. F. Newton
5th row: S. B. Peck
ValueCountFrequency (%)
h 80375
 
6.2%
j 71292
 
5.5%
b 69470
 
5.3%
r 63679
 
4.9%
e 55828
 
4.3%
a 53487
 
4.1%
f 51749
 
4.0%
w 49008
 
3.8%
s 48433
 
3.7%
m 41557
 
3.2%
Other values (5875) 716444
55.1%

Most occurring characters

ValueCountFrequency (%)
902470
15.9%
. 809332
 
14.2%
e 346518
 
6.1%
n 269566
 
4.7%
a 263758
 
4.6%
r 206774
 
3.6%
l 191074
 
3.4%
o 189434
 
3.3%
i 148627
 
2.6%
t 146239
 
2.6%
Other values (83) 2208452
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2583457
45.5%
Uppercase Letter 1299089
22.9%
Space Separator 902470
 
15.9%
Other Punctuation 889586
 
15.7%
Dash Punctuation 7497
 
0.1%
Decimal Number 137
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 346518
13.4%
n 269566
10.4%
a 263758
10.2%
r 206774
 
8.0%
l 191074
 
7.4%
o 189434
 
7.3%
i 148627
 
5.8%
t 146239
 
5.7%
s 128093
 
5.0%
u 109950
 
4.3%
Other values (31) 583424
22.6%
Uppercase Letter
ValueCountFrequency (%)
M 112265
 
8.6%
H 98725
 
7.6%
S 94549
 
7.3%
N 90440
 
7.0%
B 85290
 
6.6%
R 77014
 
5.9%
W 75283
 
5.8%
J 74552
 
5.7%
D 71976
 
5.5%
G 70394
 
5.4%
Other values (22) 448601
34.5%
Decimal Number
ValueCountFrequency (%)
9 51
37.2%
1 30
21.9%
3 21
15.3%
4 13
 
9.5%
5 8
 
5.8%
2 6
 
4.4%
6 4
 
2.9%
8 3
 
2.2%
0 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 809332
91.0%
, 79225
 
8.9%
' 776
 
0.1%
& 208
 
< 0.1%
? 45
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2
50.0%
] 2
50.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
50.0%
( 2
50.0%
Space Separator
ValueCountFrequency (%)
902470
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3882546
68.3%
Common 1799698
31.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 346518
 
8.9%
n 269566
 
6.9%
a 263758
 
6.8%
r 206774
 
5.3%
l 191074
 
4.9%
o 189434
 
4.9%
i 148627
 
3.8%
t 146239
 
3.8%
s 128093
 
3.3%
M 112265
 
2.9%
Other values (63) 1880198
48.4%
Common
ValueCountFrequency (%)
902470
50.1%
. 809332
45.0%
, 79225
 
4.4%
- 7497
 
0.4%
' 776
 
< 0.1%
& 208
 
< 0.1%
9 51
 
< 0.1%
? 45
 
< 0.1%
1 30
 
< 0.1%
3 21
 
< 0.1%
Other values (10) 43
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5678612
99.9%
None 3632
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
902470
15.9%
. 809332
 
14.3%
e 346518
 
6.1%
n 269566
 
4.7%
a 263758
 
4.6%
r 206774
 
3.6%
l 191074
 
3.4%
o 189434
 
3.3%
i 148627
 
2.6%
t 146239
 
2.6%
Other values (62) 2204820
38.8%
None
ValueCountFrequency (%)
ñ 1111
30.6%
á 796
21.9%
í 512
14.1%
é 413
 
11.4%
ö 398
 
11.0%
ü 89
 
2.5%
Á 66
 
1.8%
ó 61
 
1.7%
š 54
 
1.5%
ž 40
 
1.1%
Other values (11) 92
 
2.5%

individualCount
Text

Missing 

Distinct: 464
Distinct (%): 0.1%
Missing: 27356
Missing (%): 5.7%
Memory size: 3.7 MiB

Length

Max length: 5
Median length: 1
Mean length: 1.084885156
Min length: 1

Characters and Unicode

Total characters: 489664
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 156
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 9
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 350108
77.6%
2 28563
 
6.3%
3 12237
 
2.7%
4 9108
 
2.0%
5 5286
 
1.2%
6 4103
 
0.9%
20 3566
 
0.8%
10 2994
 
0.7%
7 2772
 
0.6%
8 2402
 
0.5%
Other values (454) 30212
 
6.7%

Most occurring characters

ValueCountFrequency (%)
1 367327
75.0%
2 39091
 
8.0%
3 18560
 
3.8%
0 17204
 
3.5%
5 14140
 
2.9%
4 13466
 
2.8%
6 7181
 
1.5%
7 5207
 
1.1%
8 4418
 
0.9%
9 3070
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 489664
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 367327
75.0%
2 39091
 
8.0%
3 18560
 
3.8%
0 17204
 
3.5%
5 14140
 
2.9%
4 13466
 
2.8%
6 7181
 
1.5%
7 5207
 
1.1%
8 4418
 
0.9%
9 3070
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 489664
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 367327
75.0%
2 39091
 
8.0%
3 18560
 
3.8%
0 17204
 
3.5%
5 14140
 
2.9%
4 13466
 
2.8%
6 7181
 
1.5%
7 5207
 
1.1%
8 4418
 
0.9%
9 3070
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 489664
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 367327
75.0%
2 39091
 
8.0%
3 18560
 
3.8%
0 17204
 
3.5%
5 14140
 
2.9%
4 13466
 
2.8%
6 7181
 
1.5%
7 5207
 
1.1%
8 4418
 
0.9%
9 3070
 
0.6%

sex
Text

Missing 

Distinct: 36
Distinct (%): < 0.1%
Missing: 26817
Missing (%): 5.6%
Memory size: 3.7 MiB

Length

Max length: 28
Median length: 13
Mean length: 13.00084755
Min length: 4

Characters and Unicode

Total characters: 5874953
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: larva/juvenile
2nd row: adult unsexed
3rd row: adult unsexed
4th row: adult unsexed
5th row: adult unsexed
ValueCountFrequency (%)
adult 404296
42.6%
unsexed 284475
30.0%
female 79291
 
8.4%
male 43499
 
4.6%
43107
 
4.5%
worker 39539
 
4.2%
unknown 33935
 
3.6%
larva/juvenile 10072
 
1.1%
queen 3567
 
0.4%
nymph 2891
 
0.3%
Other values (13) 4390
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e 843204
14.4%
u 736758
12.5%
d 690186
11.7%
a 554688
9.4%
l 550879
9.4%
497172
8.5%
t 408328
7.0%
n 403206
6.9%
s 284972
 
4.9%
x 284475
 
4.8%
Other values (24) 621085
10.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5324564
90.6%
Space Separator 497172
 
8.5%
Dash Punctuation 43107
 
0.7%
Other Punctuation 10072
 
0.2%
Uppercase Letter 38
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 843204
15.8%
u 736758
13.8%
d 690186
13.0%
a 554688
10.4%
l 550879
10.3%
t 408328
7.7%
n 403206
7.6%
s 284972
 
5.4%
x 284475
 
5.3%
m 125689
 
2.4%
Other values (15) 442179
8.3%
Uppercase Letter
ValueCountFrequency (%)
A 33
86.8%
U 1
 
2.6%
N 1
 
2.6%
K 1
 
2.6%
P 1
 
2.6%
L 1
 
2.6%
Space Separator
ValueCountFrequency (%)
497172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 43107
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 10072
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5324602
90.6%
Common 550351
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 843204
15.8%
u 736758
13.8%
d 690186
13.0%
a 554688
10.4%
l 550879
10.3%
t 408328
7.7%
n 403206
7.6%
s 284972
 
5.4%
x 284475
 
5.3%
m 125689
 
2.4%
Other values (21) 442217
8.3%
Common
ValueCountFrequency (%)
497172
90.3%
- 43107
 
7.8%
/ 10072
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5874953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 843204
14.4%
u 736758
12.5%
d 690186
11.7%
a 554688
9.4%
l 550879
9.4%
497172
8.5%
t 408328
7.0%
n 403206
6.9%
s 284972
 
4.9%
x 284475
 
4.8%
Other values (24) 621085
10.6%

lifeStage
Text

Missing 

Distinct: 36
Distinct (%): < 0.1%
Missing: 26817
Missing (%): 5.6%
Memory size: 3.7 MiB

Length

Max length: 28
Median length: 13
Mean length: 13.00084755
Min length: 4

Characters and Unicode

Total characters: 5874953
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: larva/juvenile
2nd row: adult unsexed
3rd row: adult unsexed
4th row: adult unsexed
5th row: adult unsexed
ValueCountFrequency (%)
adult 404296
42.6%
unsexed 284475
30.0%
female 79291
 
8.4%
male 43499
 
4.6%
43107
 
4.5%
worker 39539
 
4.2%
unknown 33935
 
3.6%
larva/juvenile 10072
 
1.1%
queen 3567
 
0.4%
nymph 2891
 
0.3%
Other values (13) 4390
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e 843204
14.4%
u 736758
12.5%
d 690186
11.7%
a 554688
9.4%
l 550879
9.4%
497172
8.5%
t 408328
7.0%
n 403206
6.9%
s 284972
 
4.9%
x 284475
 
4.8%
Other values (24) 621085
10.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5324564
90.6%
Space Separator 497172
 
8.5%
Dash Punctuation 43107
 
0.7%
Other Punctuation 10072
 
0.2%
Uppercase Letter 38
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 843204
15.8%
u 736758
13.8%
d 690186
13.0%
a 554688
10.4%
l 550879
10.3%
t 408328
7.7%
n 403206
7.6%
s 284972
 
5.4%
x 284475
 
5.3%
m 125689
 
2.4%
Other values (15) 442179
8.3%
Uppercase Letter
ValueCountFrequency (%)
A 33
86.8%
U 1
 
2.6%
N 1
 
2.6%
K 1
 
2.6%
P 1
 
2.6%
L 1
 
2.6%
Space Separator
ValueCountFrequency (%)
497172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 43107
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 10072
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5324602
90.6%
Common 550351
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 843204
15.8%
u 736758
13.8%
d 690186
13.0%
a 554688
10.4%
l 550879
10.3%
t 408328
7.7%
n 403206
7.6%
s 284972
 
5.4%
x 284475
 
5.3%
m 125689
 
2.4%
Other values (21) 442217
8.3%
Common
ValueCountFrequency (%)
497172
90.3%
- 43107
 
7.8%
/ 10072
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5874953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 843204
14.4%
u 736758
12.5%
d 690186
11.7%
a 554688
9.4%
l 550879
9.4%
497172
8.5%
t 408328
7.0%
n 403206
6.9%
s 284972
 
4.9%
x 284475
 
4.8%
Other values (24) 621085
10.6%

organismID
Text

Unique 

Distinct: 478707
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.846829062
Min length: 5

Characters and Unicode

Total characters: 3277625
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 478707
Unique (%): 100.0%

Sample

1st row: 3494482
2nd row: 3561334
3rd row: 2887559
4th row: 3740024
5th row: 4097745
ValueCountFrequency (%)
3494482 1
 
< 0.1%
3457661 1
 
< 0.1%
3740024 1
 
< 0.1%
4097745 1
 
< 0.1%
3818175 1
 
< 0.1%
4023155 1
 
< 0.1%
4152505 1
 
< 0.1%
3764893 1
 
< 0.1%
2845228 1
 
< 0.1%
4207023 1
 
< 0.1%
Other values (478697) 478697
> 99.9%

Most occurring characters

ValueCountFrequency (%)
3 506406
15.5%
4 378270
11.5%
2 341774
10.4%
1 339423
10.4%
8 321519
9.8%
7 293779
9.0%
5 292078
8.9%
0 273078
8.3%
9 271924
8.3%
6 259372
7.9%
Other values (2) 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3277623
> 99.9%
Lowercase Letter 1
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 506406
15.5%
4 378270
11.5%
2 341774
10.4%
1 339423
10.4%
8 321519
9.8%
7 293779
9.0%
5 292078
8.9%
0 273078
8.3%
9 271924
8.3%
6 259372
7.9%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3277624
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 506406
15.5%
4 378270
11.5%
2 341774
10.4%
1 339423
10.4%
8 321519
9.8%
7 293779
9.0%
5 292078
8.9%
0 273078
8.3%
9 271924
8.3%
6 259372
7.9%
Latin
ValueCountFrequency (%)
e 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3277625
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 506406
15.5%
4 378270
11.5%
2 341774
10.4%
1 339423
10.4%
8 321519
9.8%
7 293779
9.0%
5 292078
8.9%
0 273078
8.3%
9 271924
8.3%
6 259372
7.9%
Other values (2) 2
 
< 0.1%
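
Two patterns in the organismID tables lend themselves to a quick check. First, the five sample rows equal the digits of the "FMNHINS 0003 494 482"-style sample values shown earlier in this report, with leading zeros dropped. Second, the column contains exactly one `e` and one `+`, which would be consistent with a single value stored in scientific notation, although the report does not show that value. The sketch below is illustrative only: `catalog_to_numeric_id` and `looks_like_sci_notation` are hypothetical helpers, and the relationship between the two columns is an inference from the samples, not something the report states.

```python
import re


def catalog_to_numeric_id(catalog_number: str) -> str:
    """Keep only the digits of a catalogue string and drop leading zeros."""
    digits = "".join(ch for ch in catalog_number if ch.isdigit())
    return digits.lstrip("0") or "0"


# Matches strings such as "1e+06" or "3.5E6" (assumed shape of the odd value).
SCI_NOTATION = re.compile(r"^\d+(\.\d+)?e[+-]?\d+$", re.IGNORECASE)


def looks_like_sci_notation(value: str) -> bool:
    """Flag identifier strings that were probably written from a float."""
    return bool(SCI_NOTATION.match(value))


derived = catalog_to_numeric_id("FMNHINS 0003 494 482")
```

Running `looks_like_sci_notation` over the column would isolate the one suspicious identifier so it can be corrected at the source.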

fieldNumber
Text

Missing 

Distinct: 40248
Distinct (%): 33.8%
Missing: 359459
Missing (%): 75.1%
Memory size: 3.7 MiB

Length

Max length: 43
Median length: 37
Mean length: 9.600538374
Min length: 1

Characters and Unicode

Total characters: 1144845
Distinct characters: 84
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23373
Unique (%): 19.6%

Sample

1st row: FMHD#83-8537
2nd row: FMHD#95-108
3rd row: FMHD#76-723
4th row: FMHD#97-422
5th row: Str-902
ValueCountFrequency (%)
svp 10570
 
7.1%
tk 2777
 
1.9%
no 1478
 
1.0%
data 1469
 
1.0%
qcaz 636
 
0.4%
str-19 582
 
0.4%
mt 499
 
0.3%
cwd 486
 
0.3%
mrg 390
 
0.3%
smg 365
 
0.2%
Other values (36928) 128726
87.0%

Most occurring characters

ValueCountFrequency (%)
- 84307
 
7.4%
1 76151
 
6.7%
0 71187
 
6.2%
2 70507
 
6.2%
M 66975
 
5.9%
8 61209
 
5.3%
D 57041
 
5.0%
H 56631
 
4.9%
F 55559
 
4.9%
# 55366
 
4.8%
Other values (74) 489912
42.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 551673
48.2%
Uppercase Letter 347344
30.3%
Dash Punctuation 84307
 
7.4%
Lowercase Letter 70216
 
6.1%
Other Punctuation 60749
 
5.3%
Space Separator 28742
 
2.5%
Connector Punctuation 1751
 
0.2%
Open Punctuation 26
 
< 0.1%
Close Punctuation 26
 
< 0.1%
Math Symbol 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 22881
32.6%
r 22297
31.8%
a 4905
 
7.0%
o 4216
 
6.0%
n 3160
 
4.5%
e 2816
 
4.0%
d 2404
 
3.4%
u 1987
 
2.8%
g 905
 
1.3%
i 786
 
1.1%
Other values (17) 3859
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
M 66975
19.3%
D 57041
16.4%
H 56631
16.3%
F 55559
16.0%
S 36644
10.5%
P 13003
 
3.7%
V 11354
 
3.3%
C 10410
 
3.0%
J 6714
 
1.9%
T 4859
 
1.4%
Other values (16) 28154
8.1%
Decimal Number
ValueCountFrequency (%)
1 76151
13.8%
0 71187
12.9%
2 70507
12.8%
8 61209
11.1%
3 53636
9.7%
7 51741
9.4%
9 45997
8.3%
4 41811
7.6%
6 40753
7.4%
5 38681
7.0%
Other Punctuation
ValueCountFrequency (%)
# 55366
91.1%
. 3262
 
5.4%
/ 1622
 
2.7%
& 226
 
0.4%
, 141
 
0.2%
: 52
 
0.1%
? 28
 
< 0.1%
; 27
 
< 0.1%
" 24
 
< 0.1%
' 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
28737
> 99.9%
  5
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 24
92.3%
[ 2
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 24
92.3%
] 2
 
7.7%
Math Symbol
ValueCountFrequency (%)
+ 9
90.0%
= 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 84307
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1751
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 727285
63.5%
Latin 417560
36.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 66975
16.0%
D 57041
13.7%
H 56631
13.6%
F 55559
13.3%
S 36644
8.8%
t 22881
 
5.5%
r 22297
 
5.3%
P 13003
 
3.1%
V 11354
 
2.7%
C 10410
 
2.5%
Other values (43) 64765
15.5%
Common
ValueCountFrequency (%)
- 84307
11.6%
1 76151
10.5%
0 71187
9.8%
2 70507
9.7%
8 61209
8.4%
# 55366
7.6%
3 53636
7.4%
7 51741
7.1%
9 45997
 
6.3%
4 41811
 
5.7%
Other values (21) 115373
15.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1144838
> 99.9%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 84307
 
7.4%
1 76151
 
6.7%
0 71187
 
6.2%
2 70507
 
6.2%
M 66975
 
5.9%
8 61209
 
5.3%
D 57041
 
5.0%
H 56631
 
4.9%
F 55559
 
4.9%
# 55366
 
4.8%
Other values (71) 489905
42.8%
None
ValueCountFrequency (%)
  5
71.4%
ö 1
 
14.3%
´ 1
 
14.3%

eventDate
Text

Missing 

Distinct: 28974
Distinct (%): 7.0%
Missing: 65321
Missing (%): 13.6%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.773303885
Min length: 3

Characters and Unicode

Total characters: 4040147
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5923
Unique (%): 1.4%

Sample

1st row: 1983-10-28
2nd row: 1983-10-08
3rd row: 1995-12-13
4th row: -05-04
5th row: 1976-02-09
ValueCountFrequency (%)
1955-06-12 1141
 
0.3%
1968-06-24 1082
 
0.3%
1955-06-19 935
 
0.2%
1960-07-23 653
 
0.2%
1981-06-08 638
 
0.2%
1952-08-04 625
 
0.2%
1951-09-04 617
 
0.1%
1952-09-02 606
 
0.1%
1952-06-12 567
 
0.1%
1971-08-13 556
 
0.1%
Other values (28964) 405966
98.2%

Most occurring characters

ValueCountFrequency (%)
- 801180
19.8%
1 686429
17.0%
0 652821
16.2%
9 512192
12.7%
2 287677
 
7.1%
6 221293
 
5.5%
7 215032
 
5.3%
8 192781
 
4.8%
5 180428
 
4.5%
4 150391
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3238967
80.2%
Dash Punctuation 801180
 
19.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 686429
21.2%
0 652821
20.2%
9 512192
15.8%
2 287677
8.9%
6 221293
 
6.8%
7 215032
 
6.6%
8 192781
 
6.0%
5 180428
 
5.6%
4 150391
 
4.6%
3 139923
 
4.3%
Dash Punctuation
ValueCountFrequency (%)
- 801180
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4040147
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 801180
19.8%
1 686429
17.0%
0 652821
16.2%
9 512192
12.7%
2 287677
 
7.1%
6 221293
 
5.5%
7 215032
 
5.3%
8 192781
 
4.8%
5 180428
 
4.5%
4 150391
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4040147
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 801180
19.8%
1 686429
17.0%
0 652821
16.2%
9 512192
12.7%
2 287677
 
7.1%
6 221293
 
5.5%
7 215032
 
5.3%
8 192781
 
4.8%
5 180428
 
4.5%
4 150391
 
3.7%

eventTime
Text

Missing 

Distinct: 163
Distinct (%): 6.7%
Missing: 476277
Missing (%): 99.5%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.621399177
Min length: 1

Characters and Unicode

Total characters: 8800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 1.1%

Sample

1st row: 9.9
2nd row: 9.9
3rd row: 8.8
4th row: 17.5
5th row: 23.1
ValueCountFrequency (%)
9.5 118
 
4.9%
13.5 101
 
4.2%
10 91
 
3.7%
20.5 90
 
3.7%
19.8 88
 
3.6%
18.5 83
 
3.4%
9.6 78
 
3.2%
20.3 68
 
2.8%
11.6 63
 
2.6%
8.8 56
 
2.3%
Other values (153) 1594
65.6%

Most occurring characters

ValueCountFrequency (%)
. 2322
26.4%
1 1411
16.0%
2 1147
13.0%
5 871
 
9.9%
8 681
 
7.7%
9 679
 
7.7%
3 620
 
7.0%
0 420
 
4.8%
6 332
 
3.8%
7 176
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6478
73.6%
Other Punctuation 2322
 
26.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1411
21.8%
2 1147
17.7%
5 871
13.4%
8 681
10.5%
9 679
10.5%
3 620
9.6%
0 420
 
6.5%
6 332
 
5.1%
7 176
 
2.7%
4 141
 
2.2%
Other Punctuation
ValueCountFrequency (%)
. 2322
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8800
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 2322
26.4%
1 1411
16.0%
2 1147
13.0%
5 871
 
9.9%
8 681
 
7.7%
9 679
 
7.7%
3 620
 
7.0%
0 420
 
4.8%
6 332
 
3.8%
7 176
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2322
26.4%
1 1411
16.0%
2 1147
13.0%
5 871
 
9.9%
8 681
 
7.7%
9 679
 
7.7%
3 620
 
7.0%
0 420
 
4.8%
6 332
 
3.8%
7 176
 
2.0%
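
The eventTime values above (for example 9.9, 17.5, 23.1) are not in a clock format. If they are decimal hours, which is an assumption the report neither confirms nor rules out, they can be converted to an HH:MM string as sketched below. The helper name `decimal_hours_to_hhmm` is hypothetical.

```python
def decimal_hours_to_hhmm(value: str) -> str:
    """Convert a decimal-hours string such as "17.5" to "17:30".

    Assumes the input is hours with a fractional part, not a clock time.
    """
    # Round to whole minutes first to avoid floating-point drift (9.9 * 60).
    total_minutes = round(float(value) * 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"


converted = [decimal_hours_to_hhmm(v) for v in ["9.9", "17.5", "23.1"]]
```

Under that assumption the three sample values become 09:54, 17:30, and 23:06. With 99.5% of the column missing, any such interpretation should be confirmed against the source records before use.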

startDayOfYear
Text

Missing 

Distinct: 368
Distinct (%): 0.1%
Missing: 55118
Missing (%): 11.5%
Memory size: 3.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.859177174
Min length: 1

Characters and Unicode

Total characters: 1211116
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 301
2nd row: 281
3rd row: 347
4th row: 125
5th row: 40
ValueCountFrequency (%)
173 4314
 
1.0%
163 3624
 
0.9%
181 3559
 
0.8%
213 3505
 
0.8%
205 3482
 
0.8%
212 3456
 
0.8%
182 3429
 
0.8%
172 3360
 
0.8%
176 3351
 
0.8%
170 3347
 
0.8%
Other values (358) 388162
91.6%

Most occurring characters

ValueCountFrequency (%)
1 268051
22.1%
2 240013
19.8%
3 118788
9.8%
5 88129
 
7.3%
6 87334
 
7.2%
7 84268
 
7.0%
4 82896
 
6.8%
0 82447
 
6.8%
9 79989
 
6.6%
8 79201
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1211116
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 268051
22.1%
2 240013
19.8%
3 118788
9.8%
5 88129
 
7.3%
6 87334
 
7.2%
7 84268
 
7.0%
4 82896
 
6.8%
0 82447
 
6.8%
9 79989
 
6.6%
8 79201
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Common 1211116
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 268051
22.1%
2 240013
19.8%
3 118788
9.8%
5 88129
 
7.3%
6 87334
 
7.2%
7 84268
 
7.0%
4 82896
 
6.8%
0 82447
 
6.8%
9 79989
 
6.6%
8 79201
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1211116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 268051
22.1%
2 240013
19.8%
3 118788
9.8%
5 88129
 
7.3%
6 87334
 
7.2%
7 84268
 
7.0%
4 82896
 
6.8%
0 82447
 
6.8%
9 79989
 
6.6%
8 79201
 
6.5%
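
The startDayOfYear sample rows (301, 281, 347, 125, 40) line up with the eventDate sample rows (1983-10-28, 1983-10-08, 1995-12-13, -05-04, 1976-02-09). A minimal sketch of that relationship, using only Python's standard `datetime` module, is shown below. The helper `day_of_year` is hypothetical, and it returns `None` for partial dates such as "-05-04" because the day number depends on whether the missing year is a leap year.

```python
from datetime import date
from typing import Optional


def day_of_year(event_date: str) -> Optional[int]:
    """Return the ordinal day (1-366) for a full ISO date, else None."""
    try:
        return date.fromisoformat(event_date).timetuple().tm_yday
    except ValueError:
        # Partial or malformed dates (e.g. "-05-04") cannot be resolved here.
        return None


days = [day_of_year(d) for d in ["1983-10-28", "1983-10-08", "1995-12-13", "1976-02-09"]]
```

The four complete dates reproduce 301, 281, 347, and 40. The reported value 125 for "-05-04" corresponds to 4 May in a leap year, which suggests the export used a leap-year default for records without a year, though the report does not say so.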

endDayOfYear
Text

Missing 

Distinct: 367
Distinct (%): 0.1%
Missing: 55117
Missing (%): 11.5%
Memory size: 3.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.859840884
Min length: 1

Characters and Unicode

Total characters: 1211400
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 301
2nd row: 281
3rd row: 347
4th row: 125
5th row: 47
ValueCountFrequency (%)
170 4592
 
1.1%
173 3957
 
0.9%
212 3748
 
0.9%
181 3644
 
0.9%
205 3448
 
0.8%
172 3398
 
0.8%
213 3397
 
0.8%
176 3380
 
0.8%
195 3322
 
0.8%
169 3183
 
0.8%
Other values (357) 387521
91.5%

Most occurring characters

ValueCountFrequency (%)
1 265098
21.9%
2 241182
19.9%
3 117592
9.7%
5 88486
 
7.3%
7 85737
 
7.1%
6 85097
 
7.0%
0 84435
 
7.0%
4 82715
 
6.8%
9 80722
 
6.7%
8 80336
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1211400
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 265098
21.9%
2 241182
19.9%
3 117592
9.7%
5 88486
 
7.3%
7 85737
 
7.1%
6 85097
 
7.0%
0 84435
 
7.0%
4 82715
 
6.8%
9 80722
 
6.7%
8 80336
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1211400
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 265098
21.9%
2 241182
19.9%
3 117592
9.7%
5 88486
 
7.3%
7 85737
 
7.1%
6 85097
 
7.0%
0 84435
 
7.0%
4 82715
 
6.8%
9 80722
 
6.7%
8 80336
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1211400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 265098
21.9%
2 241182
19.9%
3 117592
9.7%
5 88486
 
7.3%
7 85737
 
7.1%
6 85097
 
7.0%
0 84435
 
7.0%
4 82715
 
6.8%
9 80722
 
6.7%
8 80336
 
6.6%

year
Text

Missing 

Distinct: 181
Distinct (%): < 0.1%
Missing: 69555
Missing (%): 14.5%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.999997556
Min length: 3

Characters and Unicode

Total characters: 1636607
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: 1983
2nd row: 1983
3rd row: 1995
4th row: 1976
5th row: 1966
ValueCountFrequency (%)
1952 13893
 
3.4%
1959 11953
 
2.9%
1968 11263
 
2.8%
1967 10240
 
2.5%
1965 9595
 
2.3%
1946 9020
 
2.2%
1941 8717
 
2.1%
1966 7298
 
1.8%
1948 7199
 
1.8%
1971 7090
 
1.7%
Other values (171) 312884
76.5%

Most occurring characters

ValueCountFrequency (%)
9 440645
26.9%
1 436536
26.7%
0 112706
 
6.9%
6 109706
 
6.7%
5 102177
 
6.2%
7 100440
 
6.1%
2 93748
 
5.7%
8 90360
 
5.5%
4 86084
 
5.3%
3 64205
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1636607
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 440645
26.9%
1 436536
26.7%
0 112706
 
6.9%
6 109706
 
6.7%
5 102177
 
6.2%
7 100440
 
6.1%
2 93748
 
5.7%
8 90360
 
5.5%
4 86084
 
5.3%
3 64205
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1636607
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 440645
26.9%
1 436536
26.7%
0 112706
 
6.9%
6 109706
 
6.7%
5 102177
 
6.2%
7 100440
 
6.1%
2 93748
 
5.7%
8 90360
 
5.5%
4 86084
 
5.3%
3 64205
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1636607
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 440645
26.9%
1 436536
26.7%
0 112706
 
6.9%
6 109706
 
6.7%
5 102177
 
6.2%
7 100440
 
6.1%
2 93748
 
5.7%
8 90360
 
5.5%
4 86084
 
5.3%
3 64205
 
3.9%

month
Text

Missing 

Distinct: 13
Distinct (%): < 0.1%
Missing: 69683
Missing (%): 14.6%
Memory size: 3.7 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.111638926
Min length: 1

Characters and Unicode

Total characters: 454687
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 10
3rd row: 12
4th row: 5
5th row: 2
ValueCountFrequency (%)
7 76625
18.7%
6 75091
18.4%
8 62882
15.4%
5 40603
9.9%
9 35445
8.7%
4 25293
 
6.2%
3 21614
 
5.3%
10 20707
 
5.1%
2 12917
 
3.2%
1 12891
 
3.2%
Other values (3) 24956
 
6.1%

Most occurring characters

ValueCountFrequency (%)
7 76625
16.9%
6 75091
16.5%
1 70695
15.5%
8 62882
13.8%
5 40603
8.9%
9 35445
7.8%
2 25728
 
5.7%
4 25295
 
5.6%
3 21614
 
4.8%
0 20709
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 454687
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 76625
16.9%
6 75091
16.5%
1 70695
15.5%
8 62882
13.8%
5 40603
8.9%
9 35445
7.8%
2 25728
 
5.7%
4 25295
 
5.6%
3 21614
 
4.8%
0 20709
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 454687
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 76625
16.9%
6 75091
16.5%
1 70695
15.5%
8 62882
13.8%
5 40603
8.9%
9 35445
7.8%
2 25728
 
5.7%
4 25295
 
5.6%
3 21614
 
4.8%
0 20709
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 454687
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 76625
16.9%
6 75091
16.5%
1 70695
15.5%
8 62882
13.8%
5 40603
8.9%
9 35445
7.8%
2 25728
 
5.7%
4 25295
 
5.6%
3 21614
 
4.8%
0 20709
 
4.6%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 86551
Missing (%): 18.1%
Memory size: 3.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.700568141
Min length: 1

Characters and Unicode

Total characters: 666888
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 28
2nd row: 8
3rd row: 13
4th row: 4
5th row: 9
ValueCountFrequency (%)
1 15997
 
4.1%
20 14838
 
3.8%
22 14713
 
3.8%
15 14419
 
3.7%
18 13855
 
3.5%
8 13849
 
3.5%
21 13516
 
3.4%
12 13507
 
3.4%
23 13476
 
3.4%
9 13283
 
3.4%
Other values (21) 250703
63.9%

Most occurring characters

ValueCountFrequency (%)
1 179198
26.9%
2 168201
25.2%
3 54104
 
8.1%
8 39539
 
5.9%
4 39012
 
5.8%
0 38621
 
5.8%
7 37967
 
5.7%
5 37648
 
5.6%
6 36496
 
5.5%
9 36102
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 666888
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 179198
26.9%
2 168201
25.2%
3 54104
 
8.1%
8 39539
 
5.9%
4 39012
 
5.8%
0 38621
 
5.8%
7 37967
 
5.7%
5 37648
 
5.6%
6 36496
 
5.5%
9 36102
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 666888
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 179198
26.9%
2 168201
25.2%
3 54104
 
8.1%
8 39539
 
5.9%
4 39012
 
5.8%
0 38621
 
5.8%
7 37967
 
5.7%
5 37648
 
5.6%
6 36496
 
5.5%
9 36102
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 666888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 179198
26.9%
2 168201
25.2%
3 54104
 
8.1%
8 39539
 
5.9%
4 39012
 
5.8%
0 38621
 
5.8%
7 37967
 
5.7%
5 37648
 
5.6%
6 36496
 
5.5%
9 36102
 
5.4%

habitat
Text

Missing 

Distinct: 5563
Distinct (%): 6.7%
Missing: 395207
Missing (%): 82.6%
Memory size: 3.7 MiB

Length

Max length: 4194
Median length: 176
Mean length: 22.20292216
Min length: 1

Characters and Unicode

Total characters: 1853944
Distinct characters: 94
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2144
Unique (%): 2.6%

Sample

1st row: tropical evergreen forest
2nd row: pine-aspen forest
3rd row: oak savanna on sand ridge
4th row: rainforest
5th row: boggy mixed forest remnant
ValueCountFrequency (%)
forest 34162
 
12.9%
tropical 10262
 
3.9%
woodland 7543
 
2.8%
oak 6996
 
2.6%
humid 6507
 
2.5%
understory 5043
 
1.9%
dry 4923
 
1.9%
mixed 4820
 
1.8%
shrubby 4533
 
1.7%
rainforest 4401
 
1.7%
Other values (3515) 175710
66.3%

Most occurring characters

ValueCountFrequency (%)
181392
 
9.8%
o 163729
 
8.8%
r 161792
 
8.7%
e 160751
 
8.7%
a 138399
 
7.5%
s 114769
 
6.2%
t 107847
 
5.8%
n 94754
 
5.1%
d 93709
 
5.1%
i 88210
 
4.8%
Other values (84) 548592
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1604783
86.6%
Space Separator 181392
 
9.8%
Other Punctuation 26427
 
1.4%
Uppercase Letter 21290
 
1.1%
Dash Punctuation 10926
 
0.6%
Decimal Number 3263
 
0.2%
Close Punctuation 2154
 
0.1%
Open Punctuation 2110
 
0.1%
Control 769
 
< 0.1%
Math Symbol 435
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 163729
 
10.2%
r 161792
 
10.1%
e 160751
 
10.0%
a 138399
 
8.6%
s 114769
 
7.2%
t 107847
 
6.7%
n 94754
 
5.9%
d 93709
 
5.8%
i 88210
 
5.5%
l 67277
 
4.2%
Other values (19) 413546
25.8%
Uppercase Letter
ValueCountFrequency (%)
P 2544
11.9%
Q 2066
 
9.7%
A 1942
 
9.1%
N 1743
 
8.2%
E 1742
 
8.2%
C 1567
 
7.4%
S 1309
 
6.1%
M 1219
 
5.7%
L 915
 
4.3%
R 856
 
4.0%
Other values (16) 5387
25.3%
Other Punctuation
ValueCountFrequency (%)
, 16457
62.3%
. 4108
 
15.5%
/ 2002
 
7.6%
; 1847
 
7.0%
" 881
 
3.3%
& 571
 
2.2%
' 261
 
1.0%
? 147
 
0.6%
# 95
 
0.4%
: 33
 
0.1%
Other values (3) 25
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 860
26.4%
1 679
20.8%
0 562
17.2%
5 429
13.1%
3 289
 
8.9%
4 129
 
4.0%
9 109
 
3.3%
8 74
 
2.3%
6 67
 
2.1%
7 65
 
2.0%
Math Symbol
ValueCountFrequency (%)
+ 197
45.3%
~ 157
36.1%
± 72
 
16.6%
> 9
 
2.1%
Close Punctuation
ValueCountFrequency (%)
) 2136
99.2%
] 11
 
0.5%
} 7
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 2090
99.1%
[ 13
 
0.6%
{ 7
 
0.3%
Control
ValueCountFrequency (%)
765
99.5%
4
 
0.5%
Other Symbol
ValueCountFrequency (%)
¦ 357
90.4%
° 38
 
9.6%
Space Separator
ValueCountFrequency (%)
181392
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10926
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1626073
87.7%
Common 227871
 
12.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 163729
 
10.1%
r 161792
 
9.9%
e 160751
 
9.9%
a 138399
 
8.5%
s 114769
 
7.1%
t 107847
 
6.6%
n 94754
 
5.8%
d 93709
 
5.8%
i 88210
 
5.4%
l 67277
 
4.1%
Other values (45) 434836
26.7%
Common
ValueCountFrequency (%)
181392
79.6%
, 16457
 
7.2%
- 10926
 
4.8%
. 4108
 
1.8%
) 2136
 
0.9%
( 2090
 
0.9%
/ 2002
 
0.9%
; 1847
 
0.8%
" 881
 
0.4%
2 860
 
0.4%
Other values (29) 5172
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1853349
> 99.9%
None 595
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
181392
 
9.8%
o 163729
 
8.8%
r 161792
 
8.7%
e 160751
 
8.7%
a 138399
 
7.5%
s 114769
 
6.2%
t 107847
 
5.8%
n 94754
 
5.1%
d 93709
 
5.1%
i 88210
 
4.8%
Other values (78) 547997
29.6%
None
ValueCountFrequency (%)
¦ 357
60.0%
ñ 126
 
21.2%
± 72
 
12.1%
° 38
 
6.4%
í 1
 
0.2%
á 1
 
0.2%

samplingProtocol
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: South America
ValueCountFrequency (%)
south 1
50.0%
america 1
50.0%

Most occurring characters

ValueCountFrequency (%)
S 1
 
7.7%
o 1
 
7.7%
u 1
 
7.7%
t 1
 
7.7%
h 1
 
7.7%
1
 
7.7%
A 1
 
7.7%
m 1
 
7.7%
e 1
 
7.7%
r 1
 
7.7%
Other values (3) 3
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
76.9%
Uppercase Letter 2
 
15.4%
Space Separator 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
10.0%
u 1
10.0%
t 1
10.0%
h 1
10.0%
m 1
10.0%
e 1
10.0%
r 1
10.0%
i 1
10.0%
c 1
10.0%
a 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
A 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
92.3%
Common 1
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1
8.3%
o 1
8.3%
u 1
8.3%
t 1
8.3%
h 1
8.3%
A 1
8.3%
m 1
8.3%
e 1
8.3%
r 1
8.3%
i 1
8.3%
Other values (2) 2
16.7%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1
 
7.7%
o 1
 
7.7%
u 1
 
7.7%
t 1
 
7.7%
h 1
 
7.7%
1
 
7.7%
A 1
 
7.7%
m 1
 
7.7%
e 1
 
7.7%
r 1
 
7.7%
Other values (3) 3
23.1%

fieldNotes
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Bolivia
ValueCountFrequency (%)
bolivia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
28.6%
B 1
14.3%
o 1
14.3%
l 1
14.3%
v 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
33.3%
o 1
16.7%
l 1
16.7%
v 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
B 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
28.6%
B 1
14.3%
o 1
14.3%
l 1
14.3%
v 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
28.6%
B 1
14.3%
o 1
14.3%
l 1
14.3%
v 1
14.3%
a 1
14.3%

locationID
Text

Missing 

Distinct: 52105
Distinct (%): 11.8%
Missing: 35666
Missing (%): 7.5%
Memory size: 3.7 MiB

Length

Max length: 36
Median length: 36
Mean length: 35.99994131
Min length: 10

Characters and Unicode

Total characters: 15949450
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20805
Unique (%): 4.7%

Sample

1st row: a93ad646-f4ef-47b4-aa14-e2d191939c8b
2nd row: 940a9125-1b89-4969-8c59-5521f8e6743c
3rd row: 9b2e4862-bf3f-41ac-83bc-d22285dcd7d8
4th row: cbda2c2e-536f-4576-914f-e1623f8b962c
5th row: b73ebdf7-7c5c-4796-8afb-1290c21f8d5a
ValueCountFrequency (%)
9fb411f7-eacc-47f5-bef3-f28f598c847b 3362
 
0.8%
b2db85bb-6338-4222-a8f3-bb95aeea12ba 3349
 
0.8%
1ad00e89-8523-4234-8871-32f6b284cba2 2428
 
0.5%
c2a285f4-67a5-44dc-8ddf-e4eaf18b753d 1860
 
0.4%
7319d49e-73af-4b41-b632-fb6ad2c90629 1835
 
0.4%
bdae3d4a-f6d9-4b66-b87c-af6792b674ee 1834
 
0.4%
24e8f5cf-8edf-4656-b5f0-1e7f0e1df6e9 1497
 
0.3%
8d03548a-339b-4488-b4e8-6cfc5b49efb7 1370
 
0.3%
28bdb2cb-1dfa-4398-8f19-bb9125f9be63 1330
 
0.3%
7b507317-4a27-4d7c-b5fb-af8673ce7d3e 1251
 
0.3%
Other values (52096) 422926
95.5%

Most occurring characters

ValueCountFrequency (%)
- 1772160
 
11.1%
4 1281446
 
8.0%
b 978000
 
6.1%
9 952115
 
6.0%
a 940472
 
5.9%
8 940017
 
5.9%
2 858138
 
5.4%
e 845026
 
5.3%
f 840002
 
5.3%
3 832818
 
5.2%
Other values (15) 5709256
35.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8946227
56.1%
Lowercase Letter 5231060
32.8%
Dash Punctuation 1772160
 
11.1%
Uppercase Letter 2
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
b 978000
18.7%
a 940472
18.0%
e 845026
16.2%
f 840002
16.1%
d 820724
15.7%
c 806831
15.4%
n 1
 
< 0.1%
t 1
 
< 0.1%
r 1
 
< 0.1%
u 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 1281446
14.3%
9 952115
10.6%
8 940017
10.5%
2 858138
9.6%
3 832818
9.3%
5 828339
9.3%
1 823094
9.2%
7 822446
9.2%
6 811756
9.1%
0 796058
8.9%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
C 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1772160
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10718388
67.2%
Latin 5231062
32.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
b 978000
18.7%
a 940472
18.0%
e 845026
16.2%
f 840002
16.1%
d 820724
15.7%
c 806831
15.4%
S 1
 
< 0.1%
n 1
 
< 0.1%
t 1
 
< 0.1%
C 1
 
< 0.1%
Other values (3) 3
 
< 0.1%
Common
ValueCountFrequency (%)
- 1772160
16.5%
4 1281446
12.0%
9 952115
8.9%
8 940017
8.8%
2 858138
8.0%
3 832818
7.8%
5 828339
7.7%
1 823094
7.7%
7 822446
7.7%
6 811756
7.6%
Other values (2) 796059
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15949450
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1772160
 
11.1%
4 1281446
 
8.0%
b 978000
 
6.1%
9 952115
 
6.0%
a 940472
 
5.9%
8 940017
 
5.9%
2 858138
 
5.4%
e 845026
 
5.3%
f 840002
 
5.3%
3 832818
 
5.2%
Other values (15) 5709256
35.8%

higherGeographyID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Ichilo
ValueCountFrequency (%)
ichilo 1
100.0%

Most occurring characters

ValueCountFrequency (%)
I 1
16.7%
c 1
16.7%
h 1
16.7%
i 1
16.7%
l 1
16.7%
o 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
83.3%
Uppercase Letter 1
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 1
20.0%
h 1
20.0%
i 1
20.0%
l 1
20.0%
o 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1
16.7%
c 1
16.7%
h 1
16.7%
i 1
16.7%
l 1
16.7%
o 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1
16.7%
c 1
16.7%
h 1
16.7%
i 1
16.7%
l 1
16.7%
o 1
16.7%

higherGeography
Text

Missing 

Distinct: 20183
Distinct (%): 12.4%
Missing: 315671
Missing (%): 65.9%
Memory size: 3.7 MiB

Length

Max length: 197
Median length: 139
Mean length: 61.38621531
Min length: 2

Characters and Unicode

Total characters: 10008163
Distinct characters: 147
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8506
Unique (%): 5.2%

Sample

1st row: [TRS]
2nd row: North America, Santiago, West Indies; Greater Antilles, Cuba, 200m: 16 km NE Caney. [LL]
3rd row: North America, Panama, Panamá, Canal Zone, Barro Colorado, 300ft: Fairchild Trail 15.3. [LL]
4th row: North America, USA, Illinois, Cook, Calumet City, Green Lake Woods, site 1. [LL]
5th row: South America, Argentina, Buenos Aires
ValueCountFrequency (%)
america 112078
 
7.6%
north 75561
 
5.1%
ll 56216
 
3.8%
usa 45309
 
3.1%
south 41956
 
2.8%
of 21084
 
1.4%
km 17973
 
1.2%
mi 13429
 
0.9%
asia 13145
 
0.9%
europe 12546
 
0.8%
Other values (19493) 1067246
72.3%

Most occurring characters

ValueCountFrequency (%)
1313951
 
13.1%
a 861471
 
8.6%
, 583468
 
5.8%
e 549514
 
5.5%
o 548496
 
5.5%
r 538420
 
5.4%
i 533979
 
5.3%
n 382626
 
3.8%
t 333705
 
3.3%
l 291404
 
2.9%
Other values (137) 4071129
40.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5935313
59.3%
Uppercase Letter 1575691
 
15.7%
Space Separator 1313952
 
13.1%
Other Punctuation 751940
 
7.5%
Decimal Number 287936
 
2.9%
Open Punctuation 66269
 
0.7%
Close Punctuation 66180
 
0.7%
Dash Punctuation 9761
 
0.1%
Other Symbol 596
 
< 0.1%
Math Symbol 457
 
< 0.1%
Other values (2) 68
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 861471
14.5%
e 549514
9.3%
o 548496
9.2%
r 538420
 
9.1%
i 533979
 
9.0%
n 382626
 
6.4%
t 333705
 
5.6%
l 291404
 
4.9%
m 272081
 
4.6%
c 253739
 
4.3%
Other values (60) 1369878
23.1%
Uppercase Letter
ValueCountFrequency (%)
A 248415
15.8%
S 177938
11.3%
L 158426
 
10.1%
N 144189
 
9.2%
C 114160
 
7.2%
P 97676
 
6.2%
E 64887
 
4.1%
M 60873
 
3.9%
R 55593
 
3.5%
U 53684
 
3.4%
Other values (31) 399850
25.4%
Other Punctuation
ValueCountFrequency (%)
, 583468
77.6%
. 90004
 
12.0%
: 60209
 
8.0%
; 13264
 
1.8%
/ 2338
 
0.3%
' 1495
 
0.2%
# 410
 
0.1%
" 261
 
< 0.1%
& 255
 
< 0.1%
? 209
 
< 0.1%
Other values (2) 27
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 80764
28.0%
1 48603
16.9%
2 35105
12.2%
5 35005
12.2%
3 20009
 
6.9%
4 18302
 
6.4%
6 15297
 
5.3%
7 12895
 
4.5%
8 11666
 
4.1%
9 10290
 
3.6%
Math Symbol
ValueCountFrequency (%)
= 416
91.0%
+ 37
 
8.1%
~ 4
 
0.9%
Space Separator
ValueCountFrequency (%)
1313951
> 99.9%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 60879
91.9%
( 5390
 
8.1%
Close Punctuation
ValueCountFrequency (%)
] 60878
92.0%
) 5302
 
8.0%
Dash Punctuation
ValueCountFrequency (%)
- 9754
99.9%
7
 
0.1%
Other Symbol
ValueCountFrequency (%)
° 596
100.0%
Other Letter
ValueCountFrequency (%)
º 63
100.0%
Final Punctuation
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7511067
75.0%
Common 2497096
 
25.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 861471
 
11.5%
e 549514
 
7.3%
o 548496
 
7.3%
r 538420
 
7.2%
i 533979
 
7.1%
n 382626
 
5.1%
t 333705
 
4.4%
l 291404
 
3.9%
m 272081
 
3.6%
c 253739
 
3.4%
Other values (102) 2945632
39.2%
Common
ValueCountFrequency (%)
1313951
52.6%
, 583468
23.4%
. 90004
 
3.6%
0 80764
 
3.2%
[ 60879
 
2.4%
] 60878
 
2.4%
: 60209
 
2.4%
1 48603
 
1.9%
2 35105
 
1.4%
5 35005
 
1.4%
Other values (25) 128230
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9961779
99.5%
None 46370
 
0.5%
Punctuation 13
 
< 0.1%
Latin Ext Additional 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1313951
 
13.2%
a 861471
 
8.6%
, 583468
 
5.9%
e 549514
 
5.5%
o 548496
 
5.5%
r 538420
 
5.4%
i 533979
 
5.4%
n 382626
 
3.8%
t 333705
 
3.3%
l 291404
 
2.9%
Other values (73) 4024745
40.4%
None
ValueCountFrequency (%)
é 12676
27.3%
á 11527
24.9%
í 7095
15.3%
ú 4821
 
10.4%
ó 3277
 
7.1%
ã 1369
 
3.0%
ô 1345
 
2.9%
ñ 614
 
1.3%
° 596
 
1.3%
ü 440
 
0.9%
Other values (50) 2610
 
5.6%
Punctuation
ValueCountFrequency (%)
7
53.8%
5
38.5%
1
 
7.7%
Latin Ext Additional
ValueCountFrequency (%)
1
100.0%

continent
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 25889
Missing (%): 5.4%
Memory size: 3.7 MiB

Length

Max length: 36
Median length: 13
Mean length: 12.15688201
Min length: 4

Characters and Unicode

Total characters: 5504855
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: North America
2nd row: North America
3rd row: North America
4th row: North America
5th row: North America
ValueCountFrequency (%)
america 399192
46.9%
north 355344
41.7%
south 43848
 
5.1%
asia 14837
 
1.7%
europe 13822
 
1.6%
africa 11697
 
1.4%
oceania 11057
 
1.3%
unknown/none 1957
 
0.2%
unknown 234
 
< 0.1%
antarctica 20
 
< 0.1%
Other values (8) 22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
r 780093
14.2%
a 447904
8.1%
i 436810
7.9%
e 426043
7.7%
A 425746
7.7%
c 421986
7.7%
o 417175
7.6%
t 399252
7.3%
399212
7.3%
m 399193
7.3%
Other values (26) 951441
17.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4249704
77.2%
Uppercase Letter 853981
 
15.5%
Space Separator 399212
 
7.3%
Other Punctuation 1958
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 780093
18.4%
a 447904
10.5%
i 436810
10.3%
e 426043
10.0%
c 421986
9.9%
o 417175
9.8%
t 399252
9.4%
m 399193
9.4%
h 399192
9.4%
u 57671
 
1.4%
Other values (11) 64385
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
A 425746
49.9%
N 357301
41.8%
S 43848
 
5.1%
E 13822
 
1.6%
O 11057
 
1.3%
U 2191
 
0.3%
I 6
 
< 0.1%
W 6
 
< 0.1%
C 1
 
< 0.1%
R 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/ 1957
99.9%
, 1
 
0.1%
Space Separator
ValueCountFrequency (%)
399212
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5103685
92.7%
Common 401170
 
7.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 780093
15.3%
a 447904
8.8%
i 436810
8.6%
e 426043
8.3%
A 425746
8.3%
c 421986
8.3%
o 417175
8.2%
t 399252
7.8%
m 399193
7.8%
h 399192
7.8%
Other values (23) 550291
10.8%
Common
ValueCountFrequency (%)
399212
99.5%
/ 1957
 
0.5%
, 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5504854
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 780093
14.2%
a 447904
8.1%
i 436810
7.9%
e 426043
7.7%
A 425746
7.7%
c 421986
7.7%
o 417175
7.6%
t 399252
7.3%
399212
7.3%
m 399193
7.3%
Other values (25) 951440
17.3%
None
ValueCountFrequency (%)
í 1
100.0%

islandGroup
Text

Missing 

Distinct: 89
Distinct (%): 0.6%
Missing: 464544
Missing (%): 97.0%
Memory size: 3.7 MiB

Length

Max length: 61
Median length: 46
Mean length: 26.86867189
Min length: 4

Characters and Unicode

Total characters: 380541
Distinct characters: 57
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 0.2%

Sample

1st row: West Indies; Greater Antilles
2nd row: West Indies; Lesser Antilles
3rd row: West Indies; Greater Antilles
4th row: West Indies; Greater Antilles
5th row: Palau Islands
ValueCountFrequency (%)
islands 8129
16.5%
west 6389
13.0%
indies 6389
13.0%
antilles 5898
12.0%
greater 4711
9.6%
caroline 3814
7.7%
palau 3040
 
6.2%
archipelago 2465
 
5.0%
philippine 2141
 
4.3%
western 1679
 
3.4%
Other values (92) 4672
9.5%

Most occurring characters

ValueCountFrequency (%)
e 44176
11.6%
s 39996
10.5%
35164
 
9.2%
l 32335
 
8.5%
n 29920
 
7.9%
a 29252
 
7.7%
i 26201
 
6.9%
r 20211
 
5.3%
t 19486
 
5.1%
d 15077
 
4.0%
Other values (47) 88723
23.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 286156
75.2%
Uppercase Letter 49254
 
12.9%
Space Separator 35164
 
9.2%
Other Punctuation 9319
 
2.4%
Open Punctuation 324
 
0.1%
Close Punctuation 324
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 44176
15.4%
s 39996
14.0%
l 32335
11.3%
n 29920
10.5%
a 29252
10.2%
i 26201
9.2%
r 20211
7.1%
t 19486
6.8%
d 15077
 
5.3%
o 7381
 
2.6%
Other values (16) 22121
7.7%
Uppercase Letter
ValueCountFrequency (%)
I 14756
30.0%
A 8617
17.5%
W 8089
16.4%
P 5225
 
10.6%
G 4742
 
9.6%
C 4350
 
8.8%
L 1199
 
2.4%
M 529
 
1.1%
N 432
 
0.9%
S 257
 
0.5%
Other values (13) 1058
 
2.1%
Other Punctuation
ValueCountFrequency (%)
; 5830
62.6%
, 3483
37.4%
& 6
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 284
87.7%
[ 40
 
12.3%
Close Punctuation
ValueCountFrequency (%)
) 284
87.7%
] 40
 
12.3%
Space Separator
ValueCountFrequency (%)
35164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 335410
88.1%
Common 45131
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 44176
13.2%
s 39996
11.9%
l 32335
9.6%
n 29920
8.9%
a 29252
8.7%
i 26201
7.8%
r 20211
 
6.0%
t 19486
 
5.8%
d 15077
 
4.5%
I 14756
 
4.4%
Other values (39) 64000
19.1%
Common
ValueCountFrequency (%)
35164
77.9%
; 5830
 
12.9%
, 3483
 
7.7%
( 284
 
0.6%
) 284
 
0.6%
[ 40
 
0.1%
] 40
 
0.1%
& 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 380399
> 99.9%
None 142
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 44176
11.6%
s 39996
10.5%
35164
 
9.2%
l 32335
 
8.5%
n 29920
 
7.9%
a 29252
 
7.7%
i 26201
 
6.9%
r 20211
 
5.3%
t 19486
 
5.1%
d 15077
 
4.0%
Other values (45) 88581
23.3%
None
ValueCountFrequency (%)
é 102
71.8%
á 40
 
28.2%

island
Text

Missing 

Distinct: 303
Distinct (%): 1.9%
Missing: 462721
Missing (%): 96.7%
Memory size: 3.7 MiB

Length

Max length: 40
Median length: 28
Mean length: 9.159139247
Min length: 3

Characters and Unicode

Total characters: 146418
Distinct characters: 66
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 79
Unique (%): 0.5%

Sample

1st row: Cuba
2nd row: Barro Colorado
3rd row: Trinidad
4th row: Hotsarihie ( = Helen Island)
5th row: Jamaica
ValueCountFrequency (%)
island 3071
 
14.2%
jamaica 2531
 
11.7%
babeldaob 1667
 
7.7%
trinidad 947
 
4.4%
cuba 928
 
4.3%
puerto 894
 
4.1%
rico 894
 
4.1%
mindanao 843
 
3.9%
luzon 738
 
3.4%
staten 411
 
1.9%
Other values (320) 8666
40.1%

Most occurring characters

ValueCountFrequency (%)
a 25209
17.2%
n 11031
 
7.5%
i 10226
 
7.0%
o 9732
 
6.6%
e 8305
 
5.7%
d 8081
 
5.5%
l 7714
 
5.3%
r 6132
 
4.2%
5604
 
3.8%
u 5096
 
3.5%
Other values (56) 49288
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 117898
80.5%
Uppercase Letter 22167
 
15.1%
Space Separator 5604
 
3.8%
Close Punctuation 307
 
0.2%
Open Punctuation 307
 
0.2%
Dash Punctuation 69
 
< 0.1%
Other Punctuation 40
 
< 0.1%
Math Symbol 26
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 25209
21.4%
n 11031
9.4%
i 10226
8.7%
o 9732
 
8.3%
e 8305
 
7.0%
d 8081
 
6.9%
l 7714
 
6.5%
r 6132
 
5.2%
u 5096
 
4.3%
b 4685
 
4.0%
Other values (21) 21687
18.4%
Uppercase Letter
ValueCountFrequency (%)
I 3324
15.0%
J 2581
11.6%
B 2287
10.3%
C 1927
8.7%
P 1872
8.4%
T 1793
8.1%
M 1444
6.5%
S 1426
6.4%
L 1118
 
5.0%
R 934
 
4.2%
Other values (16) 3461
15.6%
Close Punctuation
ValueCountFrequency (%)
) 288
93.8%
] 19
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 288
93.8%
[ 19
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 38
95.0%
' 2
 
5.0%
Space Separator
ValueCountFrequency (%)
5604
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 69
100.0%
Math Symbol
ValueCountFrequency (%)
= 26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 140065
95.7%
Common 6353
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 25209
18.0%
n 11031
 
7.9%
i 10226
 
7.3%
o 9732
 
6.9%
e 8305
 
5.9%
d 8081
 
5.8%
l 7714
 
5.5%
r 6132
 
4.4%
u 5096
 
3.6%
b 4685
 
3.3%
Other values (47) 43854
31.3%
Common
ValueCountFrequency (%)
5604
88.2%
) 288
 
4.5%
( 288
 
4.5%
- 69
 
1.1%
. 38
 
0.6%
= 26
 
0.4%
[ 19
 
0.3%
] 19
 
0.3%
' 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 146202
99.9%
None 216
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 25209
17.2%
n 11031
 
7.5%
i 10226
 
7.0%
o 9732
 
6.7%
e 8305
 
5.7%
d 8081
 
5.5%
l 7714
 
5.3%
r 6132
 
4.2%
5604
 
3.8%
u 5096
 
3.5%
Other values (50) 49072
33.6%
None
ValueCountFrequency (%)
é 137
63.4%
ç 48
 
22.2%
á 14
 
6.5%
ā 8
 
3.7%
ó 8
 
3.7%
Á 1
 
0.5%

country
Text

Missing 

Distinct: 221
Distinct (%): < 0.1%
Missing: 30269
Missing (%): 6.3%
Memory size: 3.7 MiB

Length

Max length: 48
Median length: 24
Mean length: 17.81093707
Min length: 4

Characters and Unicode

Total characters: 7987101
Distinct characters: 63
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): < 0.1%

Sample

1st row: United States of America
2nd row: United States of America
3rd row: Cuba
4th row: United States of America
5th row: Panama
ValueCountFrequency (%)
of 281849
21.5%
united 281467
21.4%
states 281043
21.4%
america 280990
21.4%
méxico 21071
 
1.6%
panamá 16169
 
1.2%
canada 13721
 
1.0%
venezuela 11609
 
0.9%
costa 10437
 
0.8%
rica 10437
 
0.8%
Other values (239) 104576
 
8.0%

Most occurring characters

ValueCountFrequency (%)
e 917087
11.5%
t 873342
 
10.9%
864931
 
10.8%
a 798920
 
10.0%
i 660135
 
8.3%
n 358626
 
4.5%
o 343323
 
4.3%
c 334763
 
4.2%
r 331614
 
4.2%
d 318472
 
4.0%
Other values (53) 2185888
27.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6089842
76.2%
Uppercase Letter 1031536
 
12.9%
Space Separator 864931
 
10.8%
Open Punctuation 392
 
< 0.1%
Close Punctuation 392
 
< 0.1%
Other Punctuation 7
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 917087
15.1%
t 873342
14.3%
a 798920
13.1%
i 660135
10.8%
n 358626
 
5.9%
o 343323
 
5.6%
c 334763
 
5.5%
r 331614
 
5.4%
d 318472
 
5.2%
s 312818
 
5.1%
Other values (22) 840742
13.8%
Uppercase Letter
ValueCountFrequency (%)
A 292578
28.4%
S 287432
27.9%
U 282432
27.4%
C 33567
 
3.3%
P 32001
 
3.1%
M 27853
 
2.7%
B 13263
 
1.3%
R 12792
 
1.2%
V 12087
 
1.2%
G 8835
 
0.9%
Other values (13) 28696
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 272
69.4%
[ 120
30.6%
Close Punctuation
ValueCountFrequency (%)
) 272
69.4%
] 120
30.6%
Other Punctuation
ValueCountFrequency (%)
' 5
71.4%
& 2
 
28.6%
Space Separator
ValueCountFrequency (%)
864931
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7121378
89.2%
Common 865723
 
10.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 917087
12.9%
t 873342
12.3%
a 798920
11.2%
i 660135
 
9.3%
n 358626
 
5.0%
o 343323
 
4.8%
c 334763
 
4.7%
r 331614
 
4.7%
d 318472
 
4.5%
s 312818
 
4.4%
Other values (45) 1872278
26.3%
Common
ValueCountFrequency (%)
864931
99.9%
( 272
 
< 0.1%
) 272
 
< 0.1%
[ 120
 
< 0.1%
] 120
 
< 0.1%
' 5
 
< 0.1%
& 2
 
< 0.1%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7944496
99.5%
None 42605
 
0.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 917087
11.5%
t 873342
11.0%
864931
10.9%
a 798920
 
10.1%
i 660135
 
8.3%
n 358626
 
4.5%
o 343323
 
4.3%
c 334763
 
4.2%
r 331614
 
4.2%
d 318472
 
4.0%
Other values (47) 2143283
27.0%
None
ValueCountFrequency (%)
é 21079
49.5%
á 16169
38.0%
ú 5346
 
12.5%
ã 5
 
< 0.1%
í 5
 
< 0.1%
ç 1
 
< 0.1%

stateProvince
Text

Missing 

Distinct: 1525
Distinct (%): 0.4%
Missing: 63499
Missing (%): 13.3%
Memory size: 3.7 MiB

Length

Max length: 42
Median length: 34
Mean length: 8.228003796
Min length: 3

Characters and Unicode

Total characters: 3416333
Distinct characters: 97
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 327
Unique (%): 0.1%

Sample

1st row: Michigan
2nd row: Indiana
3rd row: Santiago
4th row: District of Columbia
5th row: Panamá

Value                Count   Frequency (%)
illinois             45410   9.4%
florida              25929   5.4%
california           25868   5.3%
new                  18326   3.8%
arizona              16968   3.5%
indiana              16676   3.4%
texas                14876   3.1%
oregon               14014   2.9%
wisconsin            10548   2.2%
colorado             10525   2.2%
Other values (1562)  284927  58.9%

Most occurring characters

ValueCountFrequency (%)
a 436431
12.8%
i 372665
 
10.9%
o 317866
 
9.3%
n 312356
 
9.1%
l 221886
 
6.5%
r 201109
 
5.9%
s 185399
 
5.4%
e 174825
 
5.1%
d 82772
 
2.4%
t 77207
 
2.3%
Other values (87) 1033817
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2866564
83.9%
Uppercase Letter 479173
 
14.0%
Space Separator 68859
 
2.0%
Dash Punctuation 850
 
< 0.1%
Other Punctuation 681
 
< 0.1%
Open Punctuation 102
 
< 0.1%
Close Punctuation 102
 
< 0.1%
Final Punctuation 1
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 436431
15.2%
i 372665
13.0%
o 317866
11.1%
n 312356
10.9%
l 221886
7.7%
r 201109
7.0%
s 185399
 
6.5%
e 174825
 
6.1%
d 82772
 
2.9%
t 77207
 
2.7%
Other values (38) 484048
16.9%
Uppercase Letter
ValueCountFrequency (%)
I 67867
14.2%
C 67377
14.1%
A 37319
 
7.8%
M 33725
 
7.0%
N 31277
 
6.5%
F 29269
 
6.1%
O 28779
 
6.0%
T 27289
 
5.7%
W 24163
 
5.0%
S 21089
 
4.4%
Other values (25) 111019
23.2%
Other Punctuation
ValueCountFrequency (%)
. 609
89.4%
' 29
 
4.3%
, 18
 
2.6%
/ 12
 
1.8%
? 11
 
1.6%
& 2
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 98
96.1%
[ 4
 
3.9%
Close Punctuation
ValueCountFrequency (%)
) 98
96.1%
] 4
 
3.9%
Space Separator
ValueCountFrequency (%)
68859
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 850
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3345738
97.9%
Common 70595
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 436431
13.0%
i 372665
 
11.1%
o 317866
 
9.5%
n 312356
 
9.3%
l 221886
 
6.6%
r 201109
 
6.0%
s 185399
 
5.5%
e 174825
 
5.2%
d 82772
 
2.5%
t 77207
 
2.3%
Other values (74) 963222
28.8%
Common
ValueCountFrequency (%)
68859
97.5%
- 850
 
1.2%
. 609
 
0.9%
( 98
 
0.1%
) 98
 
0.1%
' 29
 
< 0.1%
, 18
 
< 0.1%
/ 12
 
< 0.1%
? 11
 
< 0.1%
[ 4
 
< 0.1%
Other values (3) 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3395926
99.4%
None 20405
 
0.6%
Punctuation 1
 
< 0.1%
Latin Ext Additional 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 436431
12.9%
i 372665
 
11.0%
o 317866
 
9.4%
n 312356
 
9.2%
l 221886
 
6.5%
r 201109
 
5.9%
s 185399
 
5.5%
e 174825
 
5.1%
d 82772
 
2.4%
t 77207
 
2.3%
Other values (54) 1013410
29.8%
None
ValueCountFrequency (%)
í 6997
34.3%
á 4892
24.0%
é 3253
15.9%
ó 3059
15.0%
ã 1130
 
5.5%
ú 443
 
2.2%
ö 149
 
0.7%
ô 100
 
0.5%
ü 72
 
0.4%
ĩ 58
 
0.3%
Other values (21) 252
 
1.2%
Punctuation
ValueCountFrequency (%)
1
100.0%
Latin Ext Additional
ValueCountFrequency (%)
1
100.0%

county
Text

Missing 

Distinct: 2148
Distinct (%): 0.8%
Missing: 201710
Missing (%): 42.1%
Memory size: 3.7 MiB

Length

Max length: 32
Median length: 25
Mean length: 6.963100683
Min length: 2

Characters and Unicode

Total characters: 1928758
Distinct characters: 78
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 295
Unique (%): 0.1%

Sample

1st row: Oakland
2nd row: Brown
3rd row: Canal Zone
4th row: Cameron
5th row: Montague

Value                Count   Frequency (%)
cook                 22104   7.2%
lake                 7403    2.4%
highlands            5120    1.7%
san                  4549    1.5%
cochise              4471    1.5%
alachua              4131    1.3%
daviess              3846    1.3%
jo                   3845    1.3%
jefferson            3466    1.1%
maricopa             2985    1.0%
Other values (2189)  245385  79.9%

Most occurring characters

ValueCountFrequency (%)
a 208348
 
10.8%
o 181267
 
9.4%
e 180435
 
9.4%
n 138236
 
7.2%
r 121498
 
6.3%
i 106779
 
5.5%
l 91125
 
4.7%
s 87266
 
4.5%
t 62764
 
3.3%
C 51939
 
2.7%
Other values (68) 699101
36.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1579492
81.9%
Uppercase Letter 314422
 
16.3%
Space Separator 30311
 
1.6%
Other Punctuation 2748
 
0.1%
Dash Punctuation 1778
 
0.1%
Decimal Number 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 208348
13.2%
o 181267
11.5%
e 180435
11.4%
n 138236
 
8.8%
r 121498
 
7.7%
i 106779
 
6.8%
l 91125
 
5.8%
s 87266
 
5.5%
t 62764
 
4.0%
k 51769
 
3.3%
Other values (30) 350005
22.2%
Uppercase Letter
ValueCountFrequency (%)
C 51939
16.5%
M 29536
 
9.4%
L 27870
 
8.9%
P 22247
 
7.1%
S 22008
 
7.0%
B 19508
 
6.2%
D 19228
 
6.1%
H 16025
 
5.1%
A 15926
 
5.1%
J 13989
 
4.4%
Other values (19) 76146
24.2%
Other Punctuation
ValueCountFrequency (%)
. 2482
90.3%
' 193
 
7.0%
, 54
 
2.0%
/ 19
 
0.7%
Decimal Number
ValueCountFrequency (%)
3 4
57.1%
1 2
28.6%
5 1
 
14.3%
Space Separator
ValueCountFrequency (%)
30311
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1778
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1893914
98.2%
Common 34844
 
1.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 208348
 
11.0%
o 181267
 
9.6%
e 180435
 
9.5%
n 138236
 
7.3%
r 121498
 
6.4%
i 106779
 
5.6%
l 91125
 
4.8%
s 87266
 
4.6%
t 62764
 
3.3%
C 51939
 
2.7%
Other values (59) 664257
35.1%
Common
ValueCountFrequency (%)
30311
87.0%
. 2482
 
7.1%
- 1778
 
5.1%
' 193
 
0.6%
, 54
 
0.2%
/ 19
 
0.1%
3 4
 
< 0.1%
1 2
 
< 0.1%
5 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1927944
> 99.9%
None 814
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 208348
 
10.8%
o 181267
 
9.4%
e 180435
 
9.4%
n 138236
 
7.2%
r 121498
 
6.3%
i 106779
 
5.5%
l 91125
 
4.7%
s 87266
 
4.5%
t 62764
 
3.3%
C 51939
 
2.7%
Other values (51) 698287
36.2%
None
ValueCountFrequency (%)
á 317
38.9%
é 137
16.8%
í 92
 
11.3%
ó 82
 
10.1%
Ñ 55
 
6.8%
ł 45
 
5.5%
è 24
 
2.9%
ō 15
 
1.8%
ü 12
 
1.5%
ñ 11
 
1.4%
Other values (7) 24
 
2.9%

locality
Text

Missing 

Distinct: 40811
Distinct (%): 9.6%
Missing: 53551
Missing (%): 11.2%
Memory size: 3.7 MiB

Length

Max length: 95
Median length: 72
Mean length: 24.23723763
Min length: 2

Characters and Unicode

Total characters: 10304607
Distinct characters: 144
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14890
Unique (%): 3.5%

Sample

1st row: Paint Creek, Clarkston Road, nr jct Kern Road, 8.5 mi N Pontiac
2nd row: Goodley Branch, S of Belmont
3rd row: 16 km NE Caney
4th row: Washington D.C.
5th row: Fairchild Trail 15.3

Value                 Count    Frequency (%)
mi                    60996    3.5%
of                    54402    3.1%
park                  28621    1.6%
km                    22915    1.3%
river                 22465    1.3%
creek                 21970    1.3%
s                     21390    1.2%
w                     19544    1.1%
lake                  19293    1.1%
n                     18716    1.1%
Other values (27739)  1453196  83.3%

Most occurring characters

ValueCountFrequency (%)
1319154
 
12.8%
a 899842
 
8.7%
e 705798
 
6.8%
o 654547
 
6.4%
i 566066
 
5.5%
r 542521
 
5.3%
n 511490
 
5.0%
l 411112
 
4.0%
t 409445
 
4.0%
s 302777
 
2.9%
Other values (134) 3981855
38.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6783121
65.8%
Uppercase Letter 1602924
 
15.6%
Space Separator 1319155
 
12.8%
Other Punctuation 345696
 
3.4%
Decimal Number 232660
 
2.3%
Dash Punctuation 8717
 
0.1%
Open Punctuation 5222
 
0.1%
Close Punctuation 5075
 
< 0.1%
Other Symbol 1118
 
< 0.1%
Math Symbol 795
 
< 0.1%
Other values (4) 124
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 899842
13.3%
e 705798
10.4%
o 654547
9.6%
i 566066
 
8.3%
r 542521
 
8.0%
n 511490
 
7.5%
l 411112
 
6.1%
t 409445
 
6.0%
s 302777
 
4.5%
u 229642
 
3.4%
Other values (55) 1549881
22.8%
Uppercase Letter
ValueCountFrequency (%)
S 173121
 
10.8%
C 160287
 
10.0%
P 123546
 
7.7%
R 112339
 
7.0%
N 106638
 
6.7%
E 92464
 
5.8%
W 91327
 
5.7%
L 88670
 
5.5%
M 86887
 
5.4%
A 80770
 
5.0%
Other values (28) 486875
30.4%
Other Punctuation
ValueCountFrequency (%)
, 260950
75.5%
. 55013
 
15.9%
; 23223
 
6.7%
' 2696
 
0.8%
# 1286
 
0.4%
" 1039
 
0.3%
& 551
 
0.2%
/ 483
 
0.1%
? 244
 
0.1%
: 176
 
0.1%
Other values (3) 35
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 47496
20.4%
5 33472
14.4%
2 32745
14.1%
3 25109
10.8%
0 23339
10.0%
4 19738
8.5%
6 15598
 
6.7%
8 12449
 
5.4%
7 12411
 
5.3%
9 10303
 
4.4%
Math Symbol
ValueCountFrequency (%)
= 414
52.1%
~ 334
42.0%
+ 41
 
5.2%
| 6
 
0.8%
Space Separator
ValueCountFrequency (%)
1319154
> 99.9%
1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 8707
99.9%
10
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 5214
99.8%
[ 8
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 5068
99.9%
] 7
 
0.1%
Final Punctuation
ValueCountFrequency (%)
22
75.9%
7
 
24.1%
Other Symbol
ValueCountFrequency (%)
° 1118
100.0%
Other Letter
ValueCountFrequency (%)
º 68
100.0%
Initial Punctuation
ValueCountFrequency (%)
22
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8386113
81.4%
Common 1918494
 
18.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 899842
 
10.7%
e 705798
 
8.4%
o 654547
 
7.8%
i 566066
 
6.8%
r 542521
 
6.5%
n 511490
 
6.1%
l 411112
 
4.9%
t 409445
 
4.9%
s 302777
 
3.6%
u 229642
 
2.7%
Other values (94) 3152873
37.6%
Common
ValueCountFrequency (%)
1319154
68.8%
, 260950
 
13.6%
. 55013
 
2.9%
1 47496
 
2.5%
5 33472
 
1.7%
2 32745
 
1.7%
3 25109
 
1.3%
0 23339
 
1.2%
; 23223
 
1.2%
4 19738
 
1.0%
Other values (30) 78255
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10282702
99.8%
None 21843
 
0.2%
Punctuation 62
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1319154
 
12.8%
a 899842
 
8.8%
e 705798
 
6.9%
o 654547
 
6.4%
i 566066
 
5.5%
r 542521
 
5.3%
n 511490
 
5.0%
l 411112
 
4.0%
t 409445
 
4.0%
s 302777
 
2.9%
Other values (75) 3959950
38.5%
None
ValueCountFrequency (%)
é 3889
17.8%
í 3443
15.8%
á 3191
14.6%
ó 2575
11.8%
ô 1689
7.7%
ñ 1632
7.5%
° 1118
 
5.1%
ú 842
 
3.9%
ã 378
 
1.7%
è 368
 
1.7%
Other values (44) 2718
12.4%
Punctuation
ValueCountFrequency (%)
22
35.5%
22
35.5%
10
16.1%
7
 
11.3%
1
 
1.6%

minimumElevationInMeters
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 67

Value  Count  Frequency (%)
67     1      100.0%

Most occurring characters

Value  Count  Frequency (%)
6      1      50.0%
7      1      50.0%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  2      100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      1      50.0%
7      1      50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2      100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
6      1      50.0%
7      1      50.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2      100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6      1      50.0%
7      1      50.0%
Distinct: 5
Distinct (%): 100.0%
Missing: 478702
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.2
Min length: 1

Characters and Unicode

Total characters: 11
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: 67
2nd row: 18
3rd row: 66
4th row: 2800
5th row: 3

Value  Count  Frequency (%)
67     1      20.0%
18     1      20.0%
66     1      20.0%
2800   1      20.0%
3      1      20.0%

Most occurring characters

Value  Count  Frequency (%)
6      3      27.3%
8      2      18.2%
0      2      18.2%
7      1      9.1%
1      1      9.1%
2      1      9.1%
3      1      9.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  11     100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      3      27.3%
8      2      18.2%
0      2      18.2%
7      1      9.1%
1      1      9.1%
2      1      9.1%
3      1      9.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  11     100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
6      3      27.3%
8      2      18.2%
0      2      18.2%
7      1      9.1%
1      1      9.1%
2      1      9.1%
3      1      9.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  11     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6      3      27.3%
8      2      18.2%
0      2      18.2%
7      1      9.1%
1      1      9.1%
2      1      9.1%
3      1      9.1%

verbatimElevation
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: -17.816667

Value      Count  Frequency (%)
17.816667  1      100.0%

Most occurring characters

Value  Count  Frequency (%)
6      3      30.0%
1      2      20.0%
7      2      20.0%
-      1      10.0%
.      1      10.0%
8      1      10.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     8      80.0%
Dash Punctuation   1      10.0%
Other Punctuation  1      10.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      3      37.5%
1      2      25.0%
7      2      25.0%
8      1      12.5%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Other Punctuation
Value  Count  Frequency (%)
.      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  10     100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
6      3      30.0%
1      2      20.0%
7      2      20.0%
-      1      10.0%
.      1      10.0%
8      1      10.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6      3      30.0%
1      2      20.0%
7      2      20.0%
-      1      10.0%
.      1      10.0%
8      1      10.0%

verticalDatum
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: -64.216667

Value      Count  Frequency (%)
64.216667  1      100.0%

Most occurring characters

Value  Count  Frequency (%)
6      4      40.0%
-      1      10.0%
4      1      10.0%
.      1      10.0%
2      1      10.0%
1      1      10.0%
7      1      10.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     8      80.0%
Dash Punctuation   1      10.0%
Other Punctuation  1      10.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      4      50.0%
4      1      12.5%
2      1      12.5%
1      1      12.5%
7      1      12.5%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Other Punctuation
Value  Count  Frequency (%)
.      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  10     100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
6      4      40.0%
-      1      10.0%
4      1      10.0%
.      1      10.0%
2      1      10.0%
1      1      10.0%
7      1      10.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6      4      40.0%
-      1      10.0%
4      1      10.0%
.      1      10.0%
2      1      10.0%
1      1      10.0%
7      1      10.0%

minimumDepthInMeters
Text

Missing 

Distinct: 17
Distinct (%): 2.3%
Missing: 477976
Missing (%): 99.8%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 6
Mean length: 6.307797538
Min length: 4

Characters and Unicode

Total characters: 4611
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.1303
2nd row: 1.1303
3rd row: 1.50114
4th row: 0.8636
5th row: 1.27

Value             Count  Frequency (%)
0.9398            96     13.1%
1.5621            66     9.0%
1.76784           63     8.6%
0.8636            57     7.8%
1.50114           56     7.7%
1.3843            56     7.7%
1.41986           50     6.8%
1.4605            48     6.6%
1.29032           45     6.2%
1.1303            39     5.3%
Other values (7)  155    21.2%

Most occurring characters

ValueCountFrequency (%)
1 800
17.3%
. 731
15.9%
8 445
9.7%
0 442
9.6%
3 402
8.7%
6 387
8.4%
4 370
8.0%
9 335
7.3%
2 248
 
5.4%
5 239
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3880
84.1%
Other Punctuation 731
 
15.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 800
20.6%
8 445
11.5%
0 442
11.4%
3 402
10.4%
6 387
10.0%
4 370
9.5%
9 335
8.6%
2 248
 
6.4%
5 239
 
6.2%
7 212
 
5.5%
Other Punctuation
ValueCountFrequency (%)
. 731
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4611
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 800
17.3%
. 731
15.9%
8 445
9.7%
0 442
9.6%
3 402
8.7%
6 387
8.4%
4 370
8.0%
9 335
7.3%
2 248
 
5.4%
5 239
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4611
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 800
17.3%
. 731
15.9%
8 445
9.7%
0 442
9.6%
3 402
8.7%
6 387
8.4%
4 370
8.0%
9 335
7.3%
2 248
 
5.4%
5 239
 
5.2%

locationRemarks
Text

Missing 

Distinct: 3457
Distinct (%): 3.8%
Missing: 387879
Missing (%): 81.0%
Memory size: 3.7 MiB

Length

Max length: 648
Median length: 89
Mean length: 91.49017924
Min length: 4

Characters and Unicode

Total characters: 8309870
Distinct characters: 130
Distinct categories: 16
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1257
Unique (%): 1.4%

Sample

1st row: Previously recorded as: North America, U.S.A., Michigan, Oakland: Paint Creek, Clarkston road, near junction Kern road (8.5 mi N Pontiac)
2nd row: Per Al Newton -- "15.3" are distances that are marked along the Barro Colorado Island trails, I think in 100m increments from the start of the trail near the middle of the island going out toward the lake. They are marked on the map in this link:||http://biogeodb.stri.si.edu/bioinformatics/bci_soil_map/location_and_access.php
3rd row: Data entry by Robin DeLaPena under funding from the InvertEBase TCN (NSF Award # 1402667)
4th row: Data entry by Robin DeLaPena under funding from the InvertEBase TCN (NSF Award # 1402667)
5th row: Data entry by Robin DeLaPena under funding from the InvertEBase TCN (NSF Award # 1402667)

Value                Count   Frequency (%)
the                  68862   5.2%
(value not shown)    56391   4.2%
by                   55215   4.1%
from                 53670   4.0%
data                 52880   4.0%
under                52800   4.0%
tcn                  52753   4.0%
award                52753   4.0%
nsf                  52753   4.0%
entry                52753   4.0%
Other values (7547)  782899  58.7%

Most occurring characters

ValueCountFrequency (%)
1245206
 
15.0%
e 650905
 
7.8%
n 551054
 
6.6%
a 546593
 
6.6%
r 454882
 
5.5%
t 400640
 
4.8%
o 335311
 
4.0%
i 326947
 
3.9%
d 248072
 
3.0%
s 206757
 
2.5%
Other values (120) 3343503
40.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5136529
61.8%
Space Separator 1245206
 
15.0%
Uppercase Letter 1105372
 
13.3%
Decimal Number 451286
 
5.4%
Other Punctuation 235102
 
2.8%
Close Punctuation 59025
 
0.7%
Open Punctuation 58947
 
0.7%
Math Symbol 7836
 
0.1%
Dash Punctuation 7345
 
0.1%
Connector Punctuation 3087
 
< 0.1%
Other values (6) 135
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 650905
12.7%
n 551054
10.7%
a 546593
10.6%
r 454882
 
8.9%
t 400640
 
7.8%
o 335311
 
6.5%
i 326947
 
6.4%
d 248072
 
4.8%
s 206757
 
4.0%
u 186912
 
3.6%
Other values (42) 1228456
23.9%
Uppercase Letter
ValueCountFrequency (%)
N 126853
11.5%
D 114567
10.4%
S 89596
 
8.1%
C 85842
 
7.8%
A 84019
 
7.6%
P 80028
 
7.2%
T 67869
 
6.1%
B 64808
 
5.9%
I 64096
 
5.8%
R 63427
 
5.7%
Other values (20) 264267
23.9%
Other Punctuation
ValueCountFrequency (%)
. 72399
30.8%
, 56598
24.1%
# 52982
22.5%
: 26643
 
11.3%
/ 11413
 
4.9%
" 6577
 
2.8%
; 4990
 
2.1%
& 1358
 
0.6%
' 1133
 
0.5%
? 583
 
0.2%
Other values (6) 426
 
0.2%
Decimal Number
ValueCountFrequency (%)
6 111586
24.7%
1 67830
15.0%
0 66531
14.7%
2 63569
14.1%
4 58457
13.0%
7 56617
12.5%
9 9320
 
2.1%
5 6088
 
1.3%
8 5736
 
1.3%
3 5552
 
1.2%
Math Symbol
ValueCountFrequency (%)
| 5804
74.1%
= 1609
 
20.5%
+ 274
 
3.5%
~ 121
 
1.5%
< 16
 
0.2%
> 6
 
0.1%
¬ 6
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 58810
99.6%
] 215
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 58760
99.7%
[ 187
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 6608
90.0%
737
 
10.0%
Final Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
1245206
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3087
100.0%
Other Symbol
ValueCountFrequency (%)
° 110
100.0%
Other Letter
ValueCountFrequency (%)
º 20
100.0%
Modifier Letter
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6241922
75.1%
Common 2067948
 
24.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 650905
 
10.4%
n 551054
 
8.8%
a 546593
 
8.8%
r 454882
 
7.3%
t 400640
 
6.4%
o 335311
 
5.4%
i 326947
 
5.2%
d 248072
 
4.0%
s 206757
 
3.3%
u 186912
 
3.0%
Other values (74) 2333849
37.4%
Common
ValueCountFrequency (%)
1245206
60.2%
6 111586
 
5.4%
. 72399
 
3.5%
1 67830
 
3.3%
0 66531
 
3.2%
2 63569
 
3.1%
) 58810
 
2.8%
( 58760
 
2.8%
4 58457
 
2.8%
7 56617
 
2.7%
Other values (36) 208183
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8306510
> 99.9%
None 2605
 
< 0.1%
Punctuation 754
 
< 0.1%
Phonetic Ext 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1245206
 
15.0%
e 650905
 
7.8%
n 551054
 
6.6%
a 546593
 
6.6%
r 454882
 
5.5%
t 400640
 
4.8%
o 335311
 
4.0%
i 326947
 
3.9%
d 248072
 
3.0%
s 206757
 
2.5%
Other values (79) 3340143
40.2%
None
ValueCountFrequency (%)
á 1053
40.4%
ó 616
23.6%
é 235
 
9.0%
í 135
 
5.2%
ã 127
 
4.9%
° 110
 
4.2%
ń 107
 
4.1%
ç 52
 
2.0%
ą 30
 
1.2%
Ö 21
 
0.8%
Other values (23) 119
 
4.6%
Punctuation
ValueCountFrequency (%)
737
97.7%
6
 
0.8%
4
 
0.5%
4
 
0.5%
1
 
0.1%
1
 
0.1%
1
 
0.1%
Phonetic Ext
ValueCountFrequency (%)
1
100.0%

decimalLatitude
Text

Missing 

Distinct: 27638
Distinct (%): 7.9%
Missing: 127471
Missing (%): 26.6%
Memory size: 3.7 MiB

Length

Max length: 15
Median length: 9
Mean length: 8.563959845
Min length: 1

Characters and Unicode

Total characters: 3007971
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8466
Unique (%): 2.4%

Sample

1st row: 42.767485
2nd row: 39.127424
3rd row: 20.168685
4th row: 38.893829
5th row: 9.17

Value                 Count   Frequency (%)
41.853613             3555    1.0%
27.189423             3410    1.0%
9.15947               3342    1.0%
41.675                2412    0.7%
41.6767               2072    0.6%
41.737659             2024    0.6%
29.577609             1835    0.5%
41.72908              1834    0.5%
36.293963             1497    0.4%
30.610867             1452    0.4%
Other values (27552)  327803  93.3%

Most occurring characters

ValueCountFrequency (%)
4 352282
11.7%
. 350941
11.7%
3 343709
11.4%
1 299770
10.0%
2 273932
9.1%
7 249737
8.3%
6 236974
7.9%
8 233589
7.8%
9 229268
7.6%
5 218869
7.3%
Other values (2) 218900
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2638573
87.7%
Other Punctuation 350941
 
11.7%
Dash Punctuation 18457
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 352282
13.4%
3 343709
13.0%
1 299770
11.4%
2 273932
10.4%
7 249737
9.5%
6 236974
9.0%
8 233589
8.9%
9 229268
8.7%
5 218869
8.3%
0 200443
7.6%
Other Punctuation
ValueCountFrequency (%)
. 350941
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18457
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3007971
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 352282
11.7%
. 350941
11.7%
3 343709
11.4%
1 299770
10.0%
2 273932
9.1%
7 249737
8.3%
6 236974
7.9%
8 233589
7.8%
9 229268
7.6%
5 218869
7.3%
Other values (2) 218900
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3007971
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 352282
11.7%
. 350941
11.7%
3 343709
11.4%
1 299770
10.0%
2 273932
9.1%
7 249737
8.3%
6 236974
7.9%
8 233589
7.8%
9 229268
7.6%
5 218869
7.3%
Other values (2) 218900
7.3%

decimalLongitude
Text

Missing 

Distinct: 27789
Distinct (%): 7.9%
Missing: 127471
Missing (%): 26.6%
Memory size: 3.7 MiB

Length

Max length: 15
Median length: 10
Mean length: 9.841260577
Min length: 1

Characters and Unicode

Total characters: 3456605
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8573
Unique (%): 2.4%

Sample

1st row: -83.218602
2nd row: -86.37487
3rd row: -75.699838
4th row: -77.032022
5th row: -79.85

Value                 Count   Frequency (%)
87.685758             3555    1.0%
81.339289             3410    1.0%
79.846034             3342    1.0%
87.697554             2012    0.6%
82.309813             1835    0.5%
87.88162              1834    0.5%
84.751612             1497    0.4%
103.875753            1452    0.4%
86.894869             1407    0.4%
111.961673            1370    0.4%
Other values (27755)  329522  93.8%

Most occurring characters

ValueCountFrequency (%)
8 381955
11.1%
1 375310
10.9%
. 350726
10.1%
- 334095
9.7%
7 309782
9.0%
9 273103
7.9%
3 251618
7.3%
2 251372
7.3%
6 250135
7.2%
5 231810
6.7%
Other values (2) 446699
12.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2771784
80.2%
Other Punctuation 350726
 
10.1%
Dash Punctuation 334095
 
9.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 381955
13.8%
1 375310
13.5%
7 309782
11.2%
9 273103
9.9%
3 251618
9.1%
2 251372
9.1%
6 250135
9.0%
5 231810
8.4%
4 227185
8.2%
0 219514
7.9%
Other Punctuation
ValueCountFrequency (%)
. 350726
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 334095
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3456605
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 381955
11.1%
1 375310
10.9%
. 350726
10.1%
- 334095
9.7%
7 309782
9.0%
9 273103
7.9%
3 251618
7.3%
2 251372
7.3%
6 250135
7.2%
5 231810
6.7%
Other values (2) 446699
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3456605
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 381955
11.1%
1 375310
10.9%
. 350726
10.1%
- 334095
9.7%
7 309782
9.0%
9 273103
7.9%
3 251618
7.3%
2 251372
7.3%
6 250135
7.2%
5 231810
6.7%
Other values (2) 446699
12.9%
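
Because decimalLatitude and decimalLongitude are profiled as Text, with a minimum length of 1, a consumer of this dataset may want to parse and range-check them before any numeric use. The following is a minimal sketch under that assumption; the function names `parse_coordinate` and `parse_lat_lon` are hypothetical and are not part of the report or of any library.

```python
def parse_coordinate(text, limit):
    """Parse one coordinate string; return None if missing, non-numeric, or out of range."""
    if text is None:
        return None
    try:
        value = float(text.strip())
    except ValueError:
        return None
    # NaN fails both comparisons, so it is rejected here as well.
    return value if -limit <= value <= limit else None


def parse_lat_lon(lat_text, lon_text):
    """Return (lat, lon) as floats, or None when either part is unusable."""
    lat = parse_coordinate(lat_text, 90.0)
    lon = parse_coordinate(lon_text, 180.0)
    if lat is None or lon is None:
        return None
    return (lat, lon)


# Values taken from the first Sample rows of the two variables.
print(parse_lat_lon("42.767485", "-83.218602"))
print(parse_lat_lon("9.17", None))
```

Treating the pair as unusable when either half is missing matches the report's identical Missing count (127471) for both variables, although that pairing is an inference rather than something the report states.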

geodeticDatum
Text

Missing 

Distinct: 8
Distinct (%): 1.1%
Missing: 477980
Missing (%): 99.8%
Memory size: 3.7 MiB

Length

Max length: 34
Median length: 5
Mean length: 5.431911967
Min length: 5

Characters and Unicode

Total characters: 3949
Distinct characters: 34
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: WGS84
2nd row: WGS84
3rd row: WGS84
4th row: WGS84
5th row: WGS84

Value             Count  Frequency (%)
wgs84             560    74.3%
nad27             94     12.5%
nad83             23     3.1%
unknown           19     2.5%
not               17     2.3%
recorded          17     2.3%
nad83/wgs84       14     1.9%
north             2      0.3%
american          2      0.3%
1927              2      0.3%
Other values (4)  4      0.5%

Most occurring characters

ValueCountFrequency (%)
8 612
15.5%
W 575
14.6%
G 575
14.6%
S 575
14.6%
4 575
14.6%
N 133
 
3.4%
A 133
 
3.4%
D 131
 
3.3%
2 96
 
2.4%
7 96
 
2.4%
Other values (24) 448
11.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2122
53.7%
Decimal Number 1422
36.0%
Lowercase Letter 358
 
9.1%
Space Separator 27
 
0.7%
Other Punctuation 14
 
0.4%
Open Punctuation 3
 
0.1%
Close Punctuation 3
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 76
21.2%
o 57
15.9%
e 39
10.9%
r 39
10.9%
d 36
10.1%
t 21
 
5.9%
c 20
 
5.6%
k 19
 
5.3%
u 19
 
5.3%
w 19
 
5.3%
Other values (7) 13
 
3.6%
Decimal Number
ValueCountFrequency (%)
8 612
43.0%
4 575
40.4%
2 96
 
6.8%
7 96
 
6.8%
3 37
 
2.6%
1 3
 
0.2%
9 3
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
W 575
27.1%
G 575
27.1%
S 575
27.1%
N 133
 
6.3%
A 133
 
6.3%
D 131
 
6.2%
Space Separator
ValueCountFrequency (%)
27
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
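
The category counts above are based on the Unicode general category of each character. As a minimal sketch, and not the code the profiling tool itself runs, the same kind of tally can be produced with Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names introduced here, and the sample values are a handful of strings taken from the frequency table above, not the full column.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# A few geodeticDatum values, used only as a small illustration.
sample = ["WGS84", "NAD27", "NAD83/WGS84", "not recorded"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Script and block membership, which the report also tabulates, are not exposed by `unicodedata`, so this sketch covers categories only.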

Most occurring scripts

ValueCountFrequency (%)
Latin 2480
62.8%
Common 1469
37.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 575
23.2%
G 575
23.2%
S 575
23.2%
N 133
 
5.4%
A 133
 
5.4%
D 131
 
5.3%
n 76
 
3.1%
o 57
 
2.3%
e 39
 
1.6%
r 39
 
1.6%
Other values (13) 147
 
5.9%
Common
ValueCountFrequency (%)
8 612
41.7%
4 575
39.1%
2 96
 
6.5%
7 96
 
6.5%
3 37
 
2.5%
27
 
1.8%
/ 14
 
1.0%
1 3
 
0.2%
( 3
 
0.2%
) 3
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 612
15.5%
W 575
14.6%
G 575
14.6%
S 575
14.6%
4 575
14.6%
N 133
 
3.4%
A 133
 
3.4%
D 131
 
3.3%
2 96
 
2.4%
7 96
 
2.4%
Other values (24) 448
11.3%
Distinct194
Distinct (%)13.9%
Missing477312
Missing (%)99.7%
Memory size3.7 MiB

Length

Max length15
Median length7
Mean length4.431541219
Min length1

Characters and Unicode

Total characters6182
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique94 ?
Unique (%)6.7%

Sample

1st row3036
2nd row58977
3rd row41000
4th row58977
5th row3036
ValueCountFrequency (%)
58977 436
31.3%
3036 160
 
11.5%
100 52
 
3.7%
0.835 48
 
3.4%
70 29
 
2.1%
50 25
 
1.8%
20 25
 
1.8%
1830 23
 
1.6%
200 23
 
1.6%
2684 21
 
1.5%
Other values (184) 553
39.6%

Most occurring characters

ValueCountFrequency (%)
7 1126
18.2%
5 797
12.9%
0 780
12.6%
3 743
12.0%
8 703
11.4%
9 579
9.4%
1 460
7.4%
6 335
 
5.4%
4 309
 
5.0%
2 231
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6063
98.1%
Other Punctuation 119
 
1.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 1126
18.6%
5 797
13.1%
0 780
12.9%
3 743
12.3%
8 703
11.6%
9 579
9.5%
1 460
7.6%
6 335
 
5.5%
4 309
 
5.1%
2 231
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 119
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6182
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 1126
18.2%
5 797
12.9%
0 780
12.6%
3 743
12.0%
8 703
11.4%
9 579
9.4%
1 460
7.4%
6 335
 
5.4%
4 309
 
5.0%
2 231
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6182
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 1126
18.2%
5 797
12.9%
0 780
12.6%
3 743
12.0%
8 703
11.4%
9 579
9.4%
1 460
7.4%
6 335
 
5.4%
4 309
 
5.0%
2 231
 
3.7%

pointRadiusSpatialFit
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing478706
Missing (%)> 99.9%
Memory size3.7 MiB

Length

Max length73
Median length73
Mean length73
Min length73

Characters and Unicode

Total characters73
Distinct characters29
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowPatrick Belenky : Field Museum of Natural History - Department of Zoology
ValueCountFrequency (%)
2
16.7%
of 2
16.7%
patrick 1
8.3%
belenky 1
8.3%
field 1
8.3%
museum 1
8.3%
natural 1
8.3%
history 1
8.3%
department 1
8.3%
zoology 1
8.3%

Most occurring characters

ValueCountFrequency (%)
11
15.1%
o 6
 
8.2%
e 6
 
8.2%
t 5
 
6.8%
a 4
 
5.5%
r 4
 
5.5%
l 4
 
5.5%
i 3
 
4.1%
u 3
 
4.1%
y 3
 
4.1%
Other values (19) 24
32.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52
71.2%
Space Separator 11
 
15.1%
Uppercase Letter 8
 
11.0%
Dash Punctuation 1
 
1.4%
Other Punctuation 1
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 6
11.5%
e 6
11.5%
t 5
9.6%
a 4
 
7.7%
r 4
 
7.7%
l 4
 
7.7%
i 3
 
5.8%
u 3
 
5.8%
y 3
 
5.8%
n 2
 
3.8%
Other values (8) 12
23.1%
Uppercase Letter
ValueCountFrequency (%)
D 1
12.5%
H 1
12.5%
N 1
12.5%
Z 1
12.5%
P 1
12.5%
F 1
12.5%
M 1
12.5%
B 1
12.5%
Space Separator
ValueCountFrequency (%)
11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60
82.2%
Common 13
 
17.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 6
 
10.0%
e 6
 
10.0%
t 5
 
8.3%
a 4
 
6.7%
r 4
 
6.7%
l 4
 
6.7%
i 3
 
5.0%
u 3
 
5.0%
y 3
 
5.0%
n 2
 
3.3%
Other values (16) 20
33.3%
Common
ValueCountFrequency (%)
11
84.6%
- 1
 
7.7%
: 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11
15.1%
o 6
 
8.2%
e 6
 
8.2%
t 5
 
6.8%
a 4
 
5.5%
r 4
 
5.5%
l 4
 
5.5%
i 3
 
4.1%
u 3
 
4.1%
y 3
 
4.1%
Other values (19) 24
32.9%

verbatimCoordinates
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing478706
Missing (%)> 99.9%
Memory size3.7 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters4
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row2013
ValueCountFrequency (%)
2013 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 1
25.0%
0 1
25.0%
1 1
25.0%
3 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1
25.0%
0 1
25.0%
1 1
25.0%
3 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1
25.0%
0 1
25.0%
1 1
25.0%
3 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1
25.0%
0 1
25.0%
1 1
25.0%
3 1
25.0%

verbatimLatitude
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing478706
Missing (%)> 99.9%
Memory size3.7 MiB

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters11
Distinct characters9
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowLatlong.net
ValueCountFrequency (%)
latlong.net 1
100.0%

Most occurring characters

ValueCountFrequency (%)
t 2
18.2%
n 2
18.2%
L 1
9.1%
a 1
9.1%
l 1
9.1%
o 1
9.1%
g 1
9.1%
. 1
9.1%
e 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
81.8%
Uppercase Letter 1
 
9.1%
Other Punctuation 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2
22.2%
n 2
22.2%
a 1
11.1%
l 1
11.1%
o 1
11.1%
g 1
11.1%
e 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
L 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
90.9%
Common 1
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2
20.0%
n 2
20.0%
L 1
10.0%
a 1
10.0%
l 1
10.0%
o 1
10.0%
g 1
10.0%
e 1
10.0%
Common
ValueCountFrequency (%)
. 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2
18.2%
n 2
18.2%
L 1
9.1%
a 1
9.1%
l 1
9.1%
o 1
9.1%
g 1
9.1%
. 1
9.1%
e 1
9.1%

georeferencedBy
Text

Missing 

Distinct90
Distinct (%)< 0.1%
Missing176818
Missing (%)36.9%
Memory size3.7 MiB

Length

Max length89
Median length70
Mean length70.94355541
Min length8

Characters and Unicode

Total characters21417079
Distinct characters54
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14 ?
Unique (%)< 0.1%

Sample

1st rowJoy Barriball : Field Museum of Natural History - Department of Zoology
2nd rowColin Bailey : Field Museum of Natural History - Department of Zoology
3rd rowRachel Hill : Field Museum of Natural History - Exhibitions
4th rowColin Bailey : Field Museum of Natural History - Department of Zoology
5th rowColin Bailey : Field Museum of Natural History - Department of Zoology
ValueCountFrequency (%)
602337
16.5%
of 601039
16.4%
museum 300930
8.2%
natural 300516
8.2%
history 300516
8.2%
field 300515
8.2%
zoology 300381
8.2%
department 300293
8.2%
colin 183361
 
5.0%
bailey 183361
 
5.0%
Other values (227) 287955
7.9%

Most occurring characters

ValueCountFrequency (%)
3359315
15.7%
o 2043540
 
9.5%
e 1579786
 
7.4%
l 1361894
 
6.4%
t 1275958
 
6.0%
a 1272733
 
5.9%
i 1093944
 
5.1%
r 1030616
 
4.8%
u 910733
 
4.3%
y 828132
 
3.9%
Other values (44) 6660428
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14950781
69.8%
Space Separator 3359315
 
15.7%
Uppercase Letter 2458152
 
11.5%
Other Punctuation 347734
 
1.6%
Dash Punctuation 301097
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2043540
13.7%
e 1579786
10.6%
l 1361894
9.1%
t 1275958
8.5%
a 1272733
8.5%
i 1093944
 
7.3%
r 1030616
 
6.9%
u 910733
 
6.1%
y 828132
 
5.5%
s 626837
 
4.2%
Other values (16) 2926608
19.6%
Uppercase Letter
ValueCountFrequency (%)
M 304607
12.4%
D 304574
12.4%
N 304083
12.4%
F 302408
12.3%
H 301250
12.3%
Z 300512
12.2%
B 231289
9.4%
C 187518
7.6%
S 53496
 
2.2%
W 47303
 
1.9%
Other values (14) 121112
 
4.9%
Other Punctuation
ValueCountFrequency (%)
: 301245
86.6%
. 46489
 
13.4%
Space Separator
ValueCountFrequency (%)
3359315
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 301097
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17408933
81.3%
Common 4008146
 
18.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2043540
 
11.7%
e 1579786
 
9.1%
l 1361894
 
7.8%
t 1275958
 
7.3%
a 1272733
 
7.3%
i 1093944
 
6.3%
r 1030616
 
5.9%
u 910733
 
5.2%
y 828132
 
4.8%
s 626837
 
3.6%
Other values (40) 5384760
30.9%
Common
ValueCountFrequency (%)
3359315
83.8%
: 301245
 
7.5%
- 301097
 
7.5%
. 46489
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21417078
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3359315
15.7%
o 2043540
 
9.5%
e 1579786
 
7.4%
l 1361894
 
6.4%
t 1275958
 
6.0%
a 1272733
 
5.9%
i 1093944
 
5.1%
r 1030616
 
4.8%
u 910733
 
4.3%
y 828132
 
3.9%
Other values (43) 6660427
31.1%
None
ValueCountFrequency (%)
ã 1
100.0%

georeferencedDate
Text

Missing 

Distinct559
Distinct (%)0.2%
Missing180845
Missing (%)37.8%
Memory size3.7 MiB

Length

Max length10
Median length10
Mean length9.376174201
Min length4

Characters and Unicode

Total characters2792806
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique72 ?
Unique (%)< 0.1%

Sample

1st row2018-10-10
2nd row2018-08-27
3rd row2017-10-19
4th row2018-09-19
5th row2019-05-30
ValueCountFrequency (%)
2013 28757
 
9.7%
2018-03-28 9005
 
3.0%
2018-07-02 4882
 
1.6%
2019-04-26 4121
 
1.4%
2018-04-02 3566
 
1.2%
2018-04-10 3449
 
1.2%
2018-02-26 3335
 
1.1%
2018-06-19 3097
 
1.0%
2018-11-15 2972
 
1.0%
2018-08-27 2944
 
1.0%
Other values (549) 231734
77.8%

Most occurring characters

ValueCountFrequency (%)
0 639258
22.9%
- 533910
19.1%
1 489477
17.5%
2 461811
16.5%
8 244728
 
8.8%
3 97201
 
3.5%
9 93673
 
3.4%
4 68654
 
2.5%
6 63816
 
2.3%
7 50767
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2258896
80.9%
Dash Punctuation 533910
 
19.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 639258
28.3%
1 489477
21.7%
2 461811
20.4%
8 244728
 
10.8%
3 97201
 
4.3%
9 93673
 
4.1%
4 68654
 
3.0%
6 63816
 
2.8%
7 50767
 
2.2%
5 49511
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
- 533910
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2792806
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 639258
22.9%
- 533910
19.1%
1 489477
17.5%
2 461811
16.5%
8 244728
 
8.8%
3 97201
 
3.5%
9 93673
 
3.4%
4 68654
 
2.5%
6 63816
 
2.3%
7 50767
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2792806
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 639258
22.9%
- 533910
19.1%
1 489477
17.5%
2 461811
16.5%
8 244728
 
8.8%
3 97201
 
3.5%
9 93673
 
3.4%
4 68654
 
2.5%
6 63816
 
2.3%
7 50767
 
1.8%

georeferenceProtocol
Text

Missing 

Distinct49
Distinct (%)< 0.1%
Missing163939
Missing (%)34.2%
Memory size3.7 MiB

Length

Max length50
Median length9
Mean length9.339475423
Min length3

Characters and Unicode

Total characters2939768
Distinct characters60
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)< 0.1%

Sample

1st rowGeoLocate
2nd rowGeoLocate
3rd rowGeoLocate
4th rowGeoLocate
5th rowmap
ValueCountFrequency (%)
geolocate 266245
79.5%
latlong.net 30510
 
9.1%
map 7324
 
2.2%
gps 5722
 
1.7%
garmin 4810
 
1.4%
etrex 3374
 
1.0%
earth 3366
 
1.0%
summit 3357
 
1.0%
google 3322
 
1.0%
gps75 1436
 
0.4%
Other values (70) 5610
 
1.7%

Most occurring characters

ValueCountFrequency (%)
e 575247
19.6%
o 573100
19.5%
t 335540
11.4%
a 313963
10.7%
L 297121
10.1%
G 282342
9.6%
c 266668
9.1%
n 67135
 
2.3%
g 36016
 
1.2%
l 34898
 
1.2%
Other values (50) 157738
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2266062
77.1%
Uppercase Letter 610641
 
20.8%
Other Punctuation 36096
 
1.2%
Space Separator 20308
 
0.7%
Decimal Number 6580
 
0.2%
Dash Punctuation 47
 
< 0.1%
Math Symbol 34
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 575247
25.4%
o 573100
25.3%
t 335540
14.8%
a 313963
13.9%
c 266668
11.8%
n 67135
 
3.0%
g 36016
 
1.6%
l 34898
 
1.5%
m 18882
 
0.8%
r 13270
 
0.6%
Other values (13) 31343
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
L 297121
48.7%
G 282342
46.2%
S 11410
 
1.9%
P 6978
 
1.1%
T 3560
 
0.6%
E 3410
 
0.6%
N 1125
 
0.2%
W 986
 
0.2%
M 868
 
0.1%
B 429
 
0.1%
Other values (12) 2412
 
0.4%
Decimal Number
ValueCountFrequency (%)
5 1649
25.1%
7 1436
21.8%
0 1021
15.5%
4 967
14.7%
8 962
14.6%
1 482
 
7.3%
2 58
 
0.9%
9 5
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 30547
84.6%
: 5289
 
14.7%
/ 192
 
0.5%
, 68
 
0.2%
Space Separator
ValueCountFrequency (%)
20308
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 47
100.0%
Math Symbol
ValueCountFrequency (%)
+ 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2876703
97.9%
Common 63065
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 575247
20.0%
o 573100
19.9%
t 335540
11.7%
a 313963
10.9%
L 297121
10.3%
G 282342
9.8%
c 266668
9.3%
n 67135
 
2.3%
g 36016
 
1.3%
l 34898
 
1.2%
Other values (35) 94673
 
3.3%
Common
ValueCountFrequency (%)
. 30547
48.4%
20308
32.2%
: 5289
 
8.4%
5 1649
 
2.6%
7 1436
 
2.3%
0 1021
 
1.6%
4 967
 
1.5%
8 962
 
1.5%
1 482
 
0.8%
/ 192
 
0.3%
Other values (5) 212
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2939768
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 575247
19.6%
o 573100
19.5%
t 335540
11.4%
a 313963
10.7%
L 297121
10.1%
G 282342
9.6%
c 266668
9.1%
n 67135
 
2.3%
g 36016
 
1.2%
l 34898
 
1.2%
Other values (50) 157738
 
5.4%

georeferenceSources
Text

Missing 

Distinct46
Distinct (%)0.9%
Missing473327
Missing (%)98.9%
Memory size3.7 MiB

Length

Max length200
Median length9
Mean length10.05167286
Min length5

Characters and Unicode

Total characters54078
Distinct characters72
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9 ?
Unique (%)0.2%

Sample

1st rowGEOLocate
2nd rowGEOLocate
3rd rowCollector
4th rowCollector
5th rowCollector
ValueCountFrequency (%)
collector 3740
55.2%
geolocate 757
 
11.2%
label 415
 
6.1%
google 143
 
2.1%
earth 139
 
2.1%
90 82
 
1.2%
06 82
 
1.2%
22 82
 
1.2%
w 82
 
1.2%
usgs 49
 
0.7%
Other values (136) 1207
 
17.8%

Most occurring characters

ValueCountFrequency (%)
o 9229
17.1%
l 8848
16.4%
e 5775
10.7%
t 5086
9.4%
c 4743
8.8%
r 4287
7.9%
C 3769
7.0%
a 1947
 
3.6%
1400
 
2.6%
G 1173
 
2.2%
Other values (62) 7821
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43064
79.6%
Uppercase Letter 7910
 
14.6%
Space Separator 1400
 
2.6%
Decimal Number 1184
 
2.2%
Other Punctuation 336
 
0.6%
Open Punctuation 70
 
0.1%
Close Punctuation 70
 
0.1%
Dash Punctuation 23
 
< 0.1%
Math Symbol 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 9229
21.4%
l 8848
20.5%
e 5775
13.4%
t 5086
11.8%
c 4743
11.0%
r 4287
10.0%
a 1947
 
4.5%
b 433
 
1.0%
s 352
 
0.8%
i 334
 
0.8%
Other values (19) 2030
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
C 3769
47.6%
G 1173
 
14.8%
E 857
 
10.8%
O 728
 
9.2%
L 564
 
7.1%
N 214
 
2.7%
T 177
 
2.2%
S 167
 
2.1%
U 52
 
0.7%
I 49
 
0.6%
Other values (13) 160
 
2.0%
Decimal Number
ValueCountFrequency (%)
0 506
42.7%
2 321
27.1%
9 85
 
7.2%
6 82
 
6.9%
3 77
 
6.5%
1 59
 
5.0%
4 48
 
4.1%
5 3
 
0.3%
7 3
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 109
32.4%
, 69
20.5%
/ 61
18.2%
: 52
15.5%
; 32
 
9.5%
& 13
 
3.9%
Space Separator
ValueCountFrequency (%)
1400
100.0%
Open Punctuation
ValueCountFrequency (%)
( 70
100.0%
Close Punctuation
ValueCountFrequency (%)
) 70
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23
100.0%
Math Symbol
ValueCountFrequency (%)
+ 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50974
94.3%
Common 3104
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 9229
18.1%
l 8848
17.4%
e 5775
11.3%
t 5086
10.0%
c 4743
9.3%
r 4287
8.4%
C 3769
7.4%
a 1947
 
3.8%
G 1173
 
2.3%
E 857
 
1.7%
Other values (42) 5260
10.3%
Common
ValueCountFrequency (%)
1400
45.1%
0 506
 
16.3%
2 321
 
10.3%
. 109
 
3.5%
9 85
 
2.7%
6 82
 
2.6%
3 77
 
2.5%
( 70
 
2.3%
) 70
 
2.3%
, 69
 
2.2%
Other values (10) 315
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54068
> 99.9%
None 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 9229
17.1%
l 8848
16.4%
e 5775
10.7%
t 5086
9.4%
c 4743
8.8%
r 4287
7.9%
C 3769
7.0%
a 1947
 
3.6%
1400
 
2.6%
G 1173
 
2.2%
Other values (58) 7811
14.4%
None
ValueCountFrequency (%)
í 4
40.0%
á 2
20.0%
ú 2
20.0%
ó 2
20.0%

georeferenceRemarks
Text

Missing 

Distinct2234
Distinct (%)6.4%
Missing443963
Missing (%)92.7%
Memory size3.7 MiB

Length

Max length752
Median length401
Mean length130.1087382
Min length1

Characters and Unicode

Total characters4520498
Distinct characters99
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique755 ?
Unique (%)2.2%

Sample

1st rowDetermined By volunteers at WeDigBio event October 2017
2nd rowCoordinates are based on the map provided by: http://texasento.net/Esperanza.html
3rd rowCoordinates and extent of radius for Benedict Prairie are based on the locality description provided by: https://dc.uwm.edu/cgi/viewcontent.cgi?referer=https://scholar.google.com/&httpsredir=1&article=1002&context=fieldstation_bulletins
4th rowCoordinates based off directions provided by:| https://watermark.silverchair.com/3-3-350.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAcAwggG8BgkqhkiG9w0BBwagggGtMIIBqQIBADCCAaIGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMjMEzQvmmZ_7xosR2AgEQgIIBcwEzvn0K0yGCfDOhgMN7msNepVKoJWKxCkfKevCnwyAlMDr5hWTsHJDxRV2aRdZl89eDLGMzeWR74fV1EFmPysWyrHI3krvmqOQvKqGxmAai2AuUDygEH9xJe5iZUgVXxqvULKGCIBJWQVJt4ZePWCbmzvMQEZXy3iqTuDH0YjgNCiCedDsVOrXkoRWf0W-HaM9y8fxUgkylOUrY3I4qCg79G5Msa8ljUbA5vS1R4DqVyvXZrtvpjoKY43Ji68TfjLd4NE4AECEKCRng_aCZTKlb_BNMb-usHzVd8b2jmKvTs9XVSD5oSRlZZjKc0PUutMtbWjxpCsXGPWtEpZ5sQqL6oBkyvwiCzt3N7o-riUmZwtiYaB43vIOcVXwDPIhoqJCcGDR3WbN5DIkCqflEU67JL11UxZl8QoENaYnhM9_Pw7uCBSoTFmoCyiNTPGh6Btd7RksKwa35DpekSCHuiuzcnSCRyPsDE4LIp9SkUJW88kE1
5th rowUnable to locate West Hemphill using Geolocate, Google Earth Pro, or online sources. There is a cemetary in Pope County named Hempville Cemetery located at (37.316571°, -88.540302°). Coordinates and extent of the radius of uncertainty are based on the entirety of Pope County.
ValueCountFrequency (%)
the 39061
 
7.0%
of 34661
 
6.2%
are 16937
 
3.0%
based 16757
 
3.0%
coordinates 15907
 
2.8%
on 14808
 
2.6%
and 13847
 
2.5%
radius 10656
 
1.9%
is 10273
 
1.8%
in 9025
 
1.6%
Other values (5003) 378155
67.5%

Most occurring characters

ValueCountFrequency (%)
527020
 
11.7%
e 374365
 
8.3%
a 298511
 
6.6%
o 297893
 
6.6%
t 255411
 
5.7%
i 243047
 
5.4%
n 239213
 
5.3%
r 229227
 
5.1%
s 183534
 
4.1%
d 137516
 
3.0%
Other values (89) 1734761
38.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3205941
70.9%
Space Separator 527020
 
11.7%
Uppercase Letter 385746
 
8.5%
Other Punctuation 173744
 
3.8%
Decimal Number 171053
 
3.8%
Math Symbol 27522
 
0.6%
Dash Punctuation 15964
 
0.4%
Connector Punctuation 12239
 
0.3%
Close Punctuation 583
 
< 0.1%
Open Punctuation 583
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 374365
11.7%
a 298511
 
9.3%
o 297893
 
9.3%
t 255411
 
8.0%
i 243047
 
7.6%
n 239213
 
7.5%
r 229227
 
7.2%
s 183534
 
5.7%
d 137516
 
4.3%
l 123990
 
3.9%
Other values (26) 823234
25.7%
Uppercase Letter
ValueCountFrequency (%)
C 61614
 
16.0%
A 21705
 
5.6%
P 18659
 
4.8%
S 18537
 
4.8%
E 18335
 
4.8%
D 18185
 
4.7%
G 17457
 
4.5%
L 17122
 
4.4%
R 16766
 
4.3%
B 16473
 
4.3%
Other values (16) 160893
41.7%
Other Punctuation
ValueCountFrequency (%)
. 60160
34.6%
/ 43569
25.1%
, 20472
 
11.8%
: 16390
 
9.4%
& 13823
 
8.0%
% 10195
 
5.9%
? 3471
 
2.0%
@ 2362
 
1.4%
# 1146
 
0.7%
' 985
 
0.6%
Other values (5) 1171
 
0.7%
Decimal Number
ValueCountFrequency (%)
0 32760
19.2%
2 29673
17.3%
1 22333
13.1%
3 16680
9.8%
8 15091
8.8%
6 12391
 
7.2%
4 12286
 
7.2%
5 11207
 
6.6%
9 9530
 
5.6%
7 9102
 
5.3%
Math Symbol
ValueCountFrequency (%)
= 17816
64.7%
+ 4951
 
18.0%
| 3955
 
14.4%
~ 800
 
2.9%
Close Punctuation
ValueCountFrequency (%)
) 574
98.5%
] 9
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 574
98.5%
[ 9
 
1.5%
Space Separator
ValueCountFrequency (%)
527020
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15964
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 12239
100.0%
Other Symbol
ValueCountFrequency (%)
° 103
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3591687
79.5%
Common 928811
 
20.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 374365
 
10.4%
a 298511
 
8.3%
o 297893
 
8.3%
t 255411
 
7.1%
i 243047
 
6.8%
n 239213
 
6.7%
r 229227
 
6.4%
s 183534
 
5.1%
d 137516
 
3.8%
l 123990
 
3.5%
Other values (52) 1208980
33.7%
Common
ValueCountFrequency (%)
527020
56.7%
. 60160
 
6.5%
/ 43569
 
4.7%
0 32760
 
3.5%
2 29673
 
3.2%
1 22333
 
2.4%
, 20472
 
2.2%
= 17816
 
1.9%
3 16680
 
1.8%
: 16390
 
1.8%
Other values (27) 141938
 
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4517804
99.9%
None 2694
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
527020
 
11.7%
e 374365
 
8.3%
a 298511
 
6.6%
o 297893
 
6.6%
t 255411
 
5.7%
i 243047
 
5.4%
n 239213
 
5.3%
r 229227
 
5.1%
s 183534
 
4.1%
d 137516
 
3.0%
Other values (77) 1732067
38.3%
None
ValueCountFrequency (%)
é 655
24.3%
á 653
24.2%
í 498
18.5%
ä 373
13.8%
ó 310
11.5%
° 103
 
3.8%
ñ 58
 
2.2%
ú 35
 
1.3%
â 3
 
0.1%
¡ 3
 
0.1%
Other values (2) 3
 
0.1%

identificationQualifier
Text

Constant  Missing 

Distinct1
Distinct (%)0.2%
Missing478083
Missing (%)99.9%
Memory size3.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1872
Distinct characters3
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcf.
2nd rowcf.
3rd rowcf.
4th rowcf.
5th rowcf.
ValueCountFrequency (%)
cf 624
100.0%

Most occurring characters

ValueCountFrequency (%)
c 624
33.3%
f 624
33.3%
. 624
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1248
66.7%
Other Punctuation 624
33.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 624
50.0%
f 624
50.0%
Other Punctuation
ValueCountFrequency (%)
. 624
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1248
66.7%
Common 624
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 624
50.0%
f 624
50.0%
Common
ValueCountFrequency (%)
. 624
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1872
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 624
33.3%
f 624
33.3%
. 624
33.3%

typeStatus
Text

Missing 

Distinct5
Distinct (%)0.5%
Missing477602
Missing (%)99.8%
Memory size3.7 MiB

Length

Max length11
Median length11
Mean length10.67873303
Min length8

Characters and Unicode

Total characters11800
Distinct characters16
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)0.1%

Sample

1st rowParatype(s)
2nd rowParatype(s)
3rd rowParatype(s)
4th rowParatype(s)
5th rowParatype(s)
ValueCountFrequency (%)
paratype(s 986
89.2%
holotype 106
 
9.6%
allotype 8
 
0.7%
paratype 4
 
0.4%
syntype(s 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
a 1980
16.8%
y 1106
9.4%
t 1105
9.4%
p 1105
9.4%
e 1105
9.4%
P 990
8.4%
r 990
8.4%
( 987
8.4%
s 987
8.4%
) 987
8.4%
Other values (6) 458
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8721
73.9%
Uppercase Letter 1105
 
9.4%
Open Punctuation 987
 
8.4%
Close Punctuation 987
 
8.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1980
22.7%
y 1106
12.7%
t 1105
12.7%
p 1105
12.7%
e 1105
12.7%
r 990
11.4%
s 987
11.3%
o 220
 
2.5%
l 122
 
1.4%
n 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
P 990
89.6%
H 106
 
9.6%
A 8
 
0.7%
S 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 987
100.0%
Close Punctuation
ValueCountFrequency (%)
) 987
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9826
83.3%
Common 1974
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1980
20.2%
y 1106
11.3%
t 1105
11.2%
p 1105
11.2%
e 1105
11.2%
P 990
10.1%
r 990
10.1%
s 987
10.0%
o 220
 
2.2%
l 122
 
1.2%
Other values (4) 116
 
1.2%
Common
ValueCountFrequency (%)
( 987
50.0%
) 987
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1980
16.8%
y 1106
9.4%
t 1105
9.4%
p 1105
9.4%
e 1105
9.4%
P 990
8.4%
r 990
8.4%
( 987
8.4%
s 987
8.4%
) 987
8.4%
Other values (6) 458
 
3.9%

identifiedBy
Text

Missing 

Distinct1519
Distinct (%)0.6%
Missing243207
Missing (%)50.8%
Memory size3.7 MiB

Length

Max length122
Median length112
Mean length63.4650828
Min length3

Characters and Unicode

Total characters14946027
Distinct characters81
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique397 ?
Unique (%)0.2%

Sample

1st rowDr. Harry G. Nelson : Field Museum of Natural History - Department of Zoology
2nd rowDr. Frank N. Young : Indiana University - Department of Biology
3rd rowRobin Delapena : Field Museum of Natural History - Invertebrates - Zoology
4th rowH. S. Barber
5th rowDr. Rupert L. Wenzel : Field Museum of Natural History - Department of Zoology
ValueCountFrequency (%)
371002
 
14.4%
of 276549
 
10.7%
dr 168339
 
6.5%
department 149766
 
5.8%
zoology 96156
 
3.7%
museum 95895
 
3.7%
natural 90971
 
3.5%
history 90277
 
3.5%
field 85352
 
3.3%
university 75306
 
2.9%
Other values (2266) 1079178
41.8%

Most occurring characters

ValueCountFrequency (%)
2343291
15.7%
o 1158437
 
7.8%
e 1085878
 
7.3%
r 983736
 
6.6%
t 846102
 
5.7%
a 796691
 
5.3%
n 645420
 
4.3%
i 632358
 
4.2%
l 625736
 
4.2%
s 472855
 
3.2%
Other values (71) 5355523
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9909548
66.3%
Space Separator 2343291
 
15.7%
Uppercase Letter 1912654
 
12.8%
Other Punctuation 602086
 
4.0%
Dash Punctuation 177804
 
1.2%
Open Punctuation 289
 
< 0.1%
Close Punctuation 289
 
< 0.1%
Decimal Number 66
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1158437
11.7%
e 1085878
11.0%
r 983736
9.9%
t 846102
 
8.5%
a 796691
 
8.0%
n 645420
 
6.5%
i 632358
 
6.4%
l 625736
 
6.3%
s 472855
 
4.8%
u 422895
 
4.3%
Other values (33) 2239440
22.6%
Uppercase Letter
ValueCountFrequency (%)
D 344232
18.0%
N 157031
 
8.2%
M 142589
 
7.5%
F 132131
 
6.9%
H 127984
 
6.7%
Z 98854
 
5.2%
S 91368
 
4.8%
U 80308
 
4.2%
E 75294
 
3.9%
C 71992
 
3.8%
Other values (16) 590871
30.9%
Other Punctuation
ValueCountFrequency (%)
. 402318
66.8%
: 193476
32.1%
, 3452
 
0.6%
& 1304
 
0.2%
/ 1006
 
0.2%
' 500
 
0.1%
? 30
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2343291
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 177804
100.0%
Open Punctuation
ValueCountFrequency (%)
( 289
100.0%
Close Punctuation
ValueCountFrequency (%)
) 289
100.0%
Decimal Number
ValueCountFrequency (%)
2 66
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11822202
79.1%
Common 3123825
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1158437
 
9.8%
e 1085878
 
9.2%
r 983736
 
8.3%
t 846102
 
7.2%
a 796691
 
6.7%
n 645420
 
5.5%
i 632358
 
5.3%
l 625736
 
5.3%
s 472855
 
4.0%
u 422895
 
3.6%
Other values (59) 4152094
35.1%
Common
ValueCountFrequency (%)
2343291
75.0%
. 402318
 
12.9%
: 193476
 
6.2%
- 177804
 
5.7%
, 3452
 
0.1%
& 1304
 
< 0.1%
/ 1006
 
< 0.1%
' 500
 
< 0.1%
( 289
 
< 0.1%
) 289
 
< 0.1%
Other values (2) 96
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14939532
> 99.9%
None 6495
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2343291
15.7%
o 1158437
 
7.8%
e 1085878
 
7.3%
r 983736
 
6.6%
t 846102
 
5.7%
a 796691
 
5.3%
n 645420
 
4.3%
i 632358
 
4.2%
l 625736
 
4.2%
s 472855
 
3.2%
Other values (53) 5349028
35.8%
None
ValueCountFrequency (%)
á 1752
27.0%
é 1323
20.4%
í 1270
19.6%
è 494
 
7.6%
ä 478
 
7.4%
ç 338
 
5.2%
ã 179
 
2.8%
ž 175
 
2.7%
ö 136
 
2.1%
ó 125
 
1.9%
Other values (8) 225
 
3.5%

dateIdentified
Text

Missing 

Distinct: 160
Distinct (%): 0.1%
Missing: 342076
Missing (%): 71.5%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.999970724
Min length: 3

Characters and Unicode

Total characters: 546520
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): < 0.1%

Sample

1st row: 2014
2nd row: 1988
3rd row: 1975
4th row: 2013
5th row: 1985

Value                   Count      Frequency (%)
1976                    11710      8.6%
1972                    5726       4.2%
2013                    5108       3.7%
2018                    4791       3.5%
2014                    4552       3.3%
1971                    4445       3.3%
1998                    4099       3.0%
2007                    3777       2.8%
2011                    3689       2.7%
1975                    3646       2.7%
Other values (150)      85088      62.3%

Most occurring characters

ValueCountFrequency (%)
1 128170
23.5%
9 105736
19.3%
0 86463
15.8%
2 75410
13.8%
7 47461
 
8.7%
6 28162
 
5.2%
8 25177
 
4.6%
5 18067
 
3.3%
4 16050
 
2.9%
3 15824
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 546520
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 128170
23.5%
9 105736
19.3%
0 86463
15.8%
2 75410
13.8%
7 47461
 
8.7%
6 28162
 
5.2%
8 25177
 
4.6%
5 18067
 
3.3%
4 16050
 
2.9%
3 15824
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Common 546520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 128170
23.5%
9 105736
19.3%
0 86463
15.8%
2 75410
13.8%
7 47461
 
8.7%
6 28162
 
5.2%
8 25177
 
4.6%
5 18067
 
3.3%
4 16050
 
2.9%
3 15824
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 546520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 128170
23.5%
9 105736
19.3%
0 86463
15.8%
2 75410
13.8%
7 47461
 
8.7%
6 28162
 
5.2%
8 25177
 
4.6%
5 18067
 
3.3%
4 16050
 
2.9%
3 15824
 
2.9%
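
The summary figures for dateIdentified can be cross-checked against one another. The sketch below assumes that Missing (%) is taken over all observations, while Distinct (%) and Mean length are taken over non-missing values only. That reading is inferred from the numbers in this report, not confirmed by it, and the helper `pct` is a hypothetical name.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_observations = 478707    # "Number of observations" from the overview
missing = 342076           # dateIdentified: Missing
distinct = 160             # dateIdentified: Distinct
total_characters = 546520  # dateIdentified: Total characters

# Assumption: per-value statistics are computed on non-missing rows only.
non_missing = n_observations - missing

missing_pct = pct(missing, n_observations)     # compare with Missing (%)
distinct_pct = pct(distinct, non_missing)      # compare with Distinct (%)
mean_length = total_characters / non_missing   # compare with Mean length

print(non_missing, missing_pct, distinct_pct, round(mean_length, 9))
```

The same assumption would also explain why the single-value columns such as identificationRemarks report Distinct 1 with Distinct (%) 100.0%.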

identificationRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 26
Median length: 26
Mean length: 26
Min length: 26

Characters and Unicode

Total characters: 26
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Hexacylloepus Hinton, 1940

Value                   Count      Frequency (%)
hexacylloepus           1          33.3%
hinton                  1          33.3%
1940                    1          33.3%

Most occurring characters

ValueCountFrequency (%)
H 2
 
7.7%
l 2
 
7.7%
o 2
 
7.7%
n 2
 
7.7%
e 2
 
7.7%
2
 
7.7%
i 1
 
3.8%
4 1
 
3.8%
9 1
 
3.8%
1 1
 
3.8%
Other values (10) 10
38.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17
65.4%
Decimal Number 4
 
15.4%
Uppercase Letter 2
 
7.7%
Space Separator 2
 
7.7%
Other Punctuation 1
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
11.8%
o 2
11.8%
n 2
11.8%
e 2
11.8%
i 1
 
5.9%
t 1
 
5.9%
s 1
 
5.9%
u 1
 
5.9%
p 1
 
5.9%
y 1
 
5.9%
Other values (3) 3
17.6%
Decimal Number
ValueCountFrequency (%)
4 1
25.0%
9 1
25.0%
1 1
25.0%
0 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
H 2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
73.1%
Common 7
 
26.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 2
10.5%
l 2
10.5%
o 2
10.5%
n 2
10.5%
e 2
10.5%
i 1
 
5.3%
t 1
 
5.3%
s 1
 
5.3%
u 1
 
5.3%
p 1
 
5.3%
Other values (4) 4
21.1%
Common
ValueCountFrequency (%)
2
28.6%
4 1
14.3%
9 1
14.3%
1 1
14.3%
, 1
14.3%
0 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 2
 
7.7%
l 2
 
7.7%
o 2
 
7.7%
n 2
 
7.7%
e 2
 
7.7%
2
 
7.7%
i 1
 
3.8%
4 1
 
3.8%
9 1
 
3.8%
1 1
 
3.8%
Other values (10) 10
38.5%

namePublishedInID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 46
Median length: 46
Mean length: 46
Min length: 46

Characters and Unicode

Total characters: 46
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Animalia Arthropoda Insecta Coleoptera Elmidae

Value                   Count      Frequency (%)
animalia                1          20.0%
arthropoda              1          20.0%
insecta                 1          20.0%
coleoptera              1          20.0%
elmidae                 1          20.0%

Most occurring characters

ValueCountFrequency (%)
a 6
13.0%
e 4
 
8.7%
4
 
8.7%
o 4
 
8.7%
i 3
 
6.5%
l 3
 
6.5%
r 3
 
6.5%
t 3
 
6.5%
A 2
 
4.3%
m 2
 
4.3%
Other values (9) 12
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37
80.4%
Uppercase Letter 5
 
10.9%
Space Separator 4
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
16.2%
e 4
10.8%
o 4
10.8%
i 3
8.1%
l 3
8.1%
r 3
8.1%
t 3
8.1%
m 2
 
5.4%
n 2
 
5.4%
p 2
 
5.4%
Other values (4) 5
13.5%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
C 1
20.0%
I 1
20.0%
E 1
20.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 42
91.3%
Common 4
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
14.3%
e 4
 
9.5%
o 4
 
9.5%
i 3
 
7.1%
l 3
 
7.1%
r 3
 
7.1%
t 3
 
7.1%
A 2
 
4.8%
m 2
 
4.8%
n 2
 
4.8%
Other values (8) 10
23.8%
Common
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
13.0%
e 4
 
8.7%
4
 
8.7%
o 4
 
8.7%
i 3
 
6.5%
l 3
 
6.5%
r 3
 
6.5%
t 3
 
6.5%
A 2
 
4.3%
m 2
 
4.3%
Other values (9) 12
26.1%

taxonConceptID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Animalia

Value                   Count      Frequency (%)
animalia                1          100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

scientificName
Text

Missing 

Distinct: 31889
Distinct (%): 7.0%
Missing: 23187
Missing (%): 4.8%
Memory size: 3.7 MiB

Length

Max length: 90
Median length: 65
Mean length: 31.4784971
Min length: 5

Characters and Unicode

Total characters: 14339085
Distinct characters: 96
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16477
Unique (%): 3.6%

Sample

1st row: Byrrhoidea Latreille, 1804
2nd row: Neoporus clypealis (Sharp, 1882)
3rd row: Ceratocombidae
4th row: Tropisternus (Tropisternus) lateralis nimbatus (Say, 1823)
5th row: Paratenetus Spinola, 1844

Value                   Count      Frequency (%)
leconte                 31571      1.9%
latreille               26922      1.6%
say                     24028      1.4%
leach                   15069      0.9%
fabricius               14401      0.8%
linnaeus                13979      0.8%
1804                    13634      0.8%
byrrhoidea              11793      0.7%
1815                    11488      0.7%
(blank)                 11183      0.7%
Other values (28170)    1527554    89.8%

Most occurring characters

ValueCountFrequency (%)
1246328
 
8.7%
e 986941
 
6.9%
a 967263
 
6.7%
i 911885
 
6.4%
s 861276
 
6.0%
r 747676
 
5.2%
o 735566
 
5.1%
l 596834
 
4.2%
u 576893
 
4.0%
n 568102
 
4.0%
Other values (86) 6140321
42.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9724811
67.8%
Decimal Number 1570348
 
11.0%
Space Separator 1246328
 
8.7%
Uppercase Letter 996397
 
6.9%
Other Punctuation 414059
 
2.9%
Close Punctuation 191422
 
1.3%
Open Punctuation 191422
 
1.3%
Dash Punctuation 4294
 
< 0.1%
Connector Punctuation 3
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 986941
10.1%
a 967263
9.9%
i 911885
9.4%
s 861276
 
8.9%
r 747676
 
7.7%
o 735566
 
7.6%
l 596834
 
6.1%
u 576893
 
5.9%
n 568102
 
5.8%
t 552263
 
5.7%
Other values (32) 2220112
22.8%
Uppercase Letter
ValueCountFrequency (%)
C 116120
11.7%
L 114198
11.5%
S 95034
9.5%
P 75481
 
7.6%
H 71820
 
7.2%
B 71022
 
7.1%
A 61813
 
6.2%
M 58862
 
5.9%
E 56184
 
5.6%
F 42780
 
4.3%
Other values (16) 233083
23.4%
Decimal Number
ValueCountFrequency (%)
1 463339
29.5%
8 334986
21.3%
9 132858
 
8.5%
5 106136
 
6.8%
7 106045
 
6.8%
4 97695
 
6.2%
3 93392
 
5.9%
2 88597
 
5.6%
0 73924
 
4.7%
6 73376
 
4.7%
Other Punctuation
ValueCountFrequency (%)
, 392270
94.7%
& 11179
 
2.7%
. 9307
 
2.2%
" 674
 
0.2%
' 436
 
0.1%
# 104
 
< 0.1%
* 53
 
< 0.1%
? 26
 
< 0.1%
/ 9
 
< 0.1%
: 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 191408
> 99.9%
] 14
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 191408
> 99.9%
[ 14
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1246328
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4294
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10721208
74.8%
Common 3617877
 
25.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 986941
 
9.2%
a 967263
 
9.0%
i 911885
 
8.5%
s 861276
 
8.0%
r 747676
 
7.0%
o 735566
 
6.9%
l 596834
 
5.6%
u 576893
 
5.4%
n 568102
 
5.3%
t 552263
 
5.2%
Other values (58) 3216509
30.0%
Common
ValueCountFrequency (%)
1246328
34.4%
1 463339
 
12.8%
, 392270
 
10.8%
8 334986
 
9.3%
) 191408
 
5.3%
( 191408
 
5.3%
9 132858
 
3.7%
5 106136
 
2.9%
7 106045
 
2.9%
4 97695
 
2.7%
Other values (18) 355404
 
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14316975
99.8%
None 22110
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1246328
 
8.7%
e 986941
 
6.9%
a 967263
 
6.8%
i 911885
 
6.4%
s 861276
 
6.0%
r 747676
 
5.2%
o 735566
 
5.1%
l 596834
 
4.2%
u 576893
 
4.0%
n 568102
 
4.0%
Other values (70) 6118211
42.7%
None
ValueCountFrequency (%)
é 15066
68.1%
ü 3100
 
14.0%
ö 1050
 
4.7%
ã 1048
 
4.7%
å 499
 
2.3%
ä 465
 
2.1%
è 426
 
1.9%
á 199
 
0.9%
ç 156
 
0.7%
ñ 72
 
0.3%
Other values (6) 29
 
0.1%

acceptedNameUsage
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Insecta

Value                   Count      Frequency (%)
insecta                 1          100.0%

Most occurring characters

ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1
16.7%
s 1
16.7%
e 1
16.7%
c 1
16.7%
t 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

parentNameUsage
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Coleoptera

Value                   Count      Frequency (%)
coleoptera              1          100.0%

Most occurring characters

ValueCountFrequency (%)
o 2
20.0%
e 2
20.0%
C 1
10.0%
l 1
10.0%
p 1
10.0%
t 1
10.0%
r 1
10.0%
a 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
22.2%
e 2
22.2%
l 1
11.1%
p 1
11.1%
t 1
11.1%
r 1
11.1%
a 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2
20.0%
e 2
20.0%
C 1
10.0%
l 1
10.0%
p 1
10.0%
t 1
10.0%
r 1
10.0%
a 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2
20.0%
e 2
20.0%
C 1
10.0%
l 1
10.0%
p 1
10.0%
t 1
10.0%
r 1
10.0%
a 1
10.0%

nameAccordingTo
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 478706
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Elmidae

Value                   Count      Frequency (%)
elmidae                 1          100.0%

Most occurring characters

ValueCountFrequency (%)
E 1
14.3%
l 1
14.3%
m 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%
e 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1
16.7%
m 1
16.7%
i 1
16.7%
d 1
16.7%
a 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1
14.3%
l 1
14.3%
m 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1
14.3%
l 1
14.3%
m 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%
e 1
14.3%

higherClassification
Text

Missing 

Distinct: 734
Distinct (%): 0.2%
Missing: 23188
Missing (%): 4.8%
Memory size: 3.7 MiB

Length

Max length: 66
Median length: 63
Mean length: 48.91391797
Min length: 19

Characters and Unicode

Total characters: 22281219
Distinct characters: 52
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 105
Unique (%): < 0.1%

Sample

1st row: Animalia Arthropoda Insecta Coleoptera
2nd row: Animalia Arthropoda Insecta Coleoptera Dytiscidae
3rd row: Animalia Arthropoda Insecta Hemiptera Ceratocombidae
4th row: Animalia Arthropoda Insecta Coleoptera Hydrophilidae
5th row: Animalia Arthropoda Insecta Coleoptera Tenebrionidae

Value                   Count      Frequency (%)
animalia                455518     20.3%
arthropoda              455516     20.3%
insecta                 413026     18.4%
coleoptera              248732     11.1%
hymenoptera             78942      3.5%
formicidae              43933      2.0%
histeridae              40587      1.8%
diptera                 31203      1.4%
dytiscidae              30525      1.4%
tenebrionidae           29192      1.3%
Other values (705)      414789     18.5%

Most occurring characters

ValueCountFrequency (%)
a 2866005
12.9%
e 1891292
 
8.5%
o 1799641
 
8.1%
1786448
 
8.0%
i 1759679
 
7.9%
r 1630393
 
7.3%
t 1466097
 
6.6%
n 1136599
 
5.1%
p 1077331
 
4.8%
d 1009254
 
4.5%
Other values (42) 5858480
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18252801
81.9%
Uppercase Letter 2241958
 
10.1%
Space Separator 1786448
 
8.0%
Other Punctuation 10
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2866005
15.7%
e 1891292
10.4%
o 1799641
9.9%
i 1759679
9.6%
r 1630393
8.9%
t 1466097
8.0%
n 1136599
 
6.2%
p 1077331
 
5.9%
d 1009254
 
5.5%
l 921658
 
5.0%
Other values (15) 2694852
14.8%
Uppercase Letter
ValueCountFrequency (%)
A 967619
43.2%
I 417163
18.6%
C 285015
 
12.7%
H 161500
 
7.2%
S 89235
 
4.0%
D 88838
 
4.0%
F 44397
 
2.0%
L 41938
 
1.9%
T 38304
 
1.7%
P 27673
 
1.2%
Other values (13) 80276
 
3.6%
Space Separator
ValueCountFrequency (%)
1786448
100.0%
Other Punctuation
ValueCountFrequency (%)
? 10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20494759
92.0%
Common 1786460
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2866005
14.0%
e 1891292
 
9.2%
o 1799641
 
8.8%
i 1759679
 
8.6%
r 1630393
 
8.0%
t 1466097
 
7.2%
n 1136599
 
5.5%
p 1077331
 
5.3%
d 1009254
 
4.9%
A 967619
 
4.7%
Other values (38) 4890849
23.9%
Common
ValueCountFrequency (%)
1786448
> 99.9%
? 10
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22281219
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2866005
12.9%
e 1891292
 
8.5%
o 1799641
 
8.1%
1786448
 
8.0%
i 1759679
 
7.9%
r 1630393
 
7.3%
t 1466097
 
6.6%
n 1136599
 
5.1%
p 1077331
 
4.8%
d 1009254
 
4.5%
Other values (42) 5858480
26.3%

kingdom
Text

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 23187
Missing (%): 4.8%
Memory size: 3.7 MiB

Length

Max length: 13
Median length: 8
Mean length: 8.000021953
Min length: 8

Characters and Unicode

Total characters: 3644170
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Animalia
2nd row: Animalia
3rd row: Animalia
4th row: Animalia
5th row: Animalia

Value                   Count      Frequency (%)
animalia                455518     > 99.9%
hexacylloepus           1          < 0.1%
chelodesminae           1          < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 911038
25.0%
i 911037
25.0%
l 455521
12.5%
m 455519
12.5%
n 455519
12.5%
A 455518
12.5%
e 5
 
< 0.1%
s 2
 
< 0.1%
o 2
 
< 0.1%
u 1
 
< 0.1%
Other values (8) 8
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3188650
87.5%
Uppercase Letter 455520
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 911038
28.6%
i 911037
28.6%
l 455521
14.3%
m 455519
14.3%
n 455519
14.3%
e 5
 
< 0.1%
s 2
 
< 0.1%
o 2
 
< 0.1%
u 1
 
< 0.1%
h 1
 
< 0.1%
Other values (5) 5
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 455518
> 99.9%
C 1
 
< 0.1%
H 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3644170
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 911038
25.0%
i 911037
25.0%
l 455521
12.5%
m 455519
12.5%
n 455519
12.5%
A 455518
12.5%
e 5
 
< 0.1%
s 2
 
< 0.1%
o 2
 
< 0.1%
u 1
 
< 0.1%
Other values (8) 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3644170
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 911038
25.0%
i 911037
25.0%
l 455521
12.5%
m 455519
12.5%
n 455519
12.5%
A 455518
12.5%
e 5
 
< 0.1%
s 2
 
< 0.1%
o 2
 
< 0.1%
u 1
 
< 0.1%
Other values (8) 8
 
< 0.1%
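
The kingdom counts in this section include two one-off values, hexacylloepus and chelodesminae, alongside 455518 occurrences of animalia. These look like lower-rank taxon names, which may point to rows whose fields shifted into the wrong columns. A minimal sketch of a check for such rows follows. The `EXPECTED_KINGDOMS` set, the `find_suspect_rows` helper, and the sample rows are hypothetical and only illustrate the idea.

```python
# Assumption: "Animalia" is the only legitimate kingdom for this collection.
EXPECTED_KINGDOMS = {"Animalia"}


def find_suspect_rows(rows, column="kingdom", expected=EXPECTED_KINGDOMS):
    """Return (row index, value) pairs whose non-empty value is unexpected."""
    suspects = []
    for index, row in enumerate(rows):
        value = (row.get(column) or "").strip()
        # Empty values are treated as missing, not as suspect.
        if value and value not in expected:
            suspects.append((index, value))
    return suspects


# Hypothetical rows standing in for records read from the occurrence file.
sample_rows = [
    {"kingdom": "Animalia"},
    {"kingdom": "Hexacylloepus"},
    {"kingdom": ""},
    {"kingdom": "Chelodesminae"},
]

suspects = find_suspect_rows(sample_rows)
print(suspects)
```

Rows flagged this way would still need manual inspection, since the check only says that a value is unexpected, not which column it belongs to.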

phylum
Text

Constant  Missing 

Distinct: 1
Distinct (%): < 0.1%
Missing: 23191
Missing (%): 4.8%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4555160
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Arthropoda
2nd row: Arthropoda
3rd row: Arthropoda
4th row: Arthropoda
5th row: Arthropoda

Value                   Count      Frequency (%)
arthropoda              455516     100.0%

Most occurring characters

ValueCountFrequency (%)
r 911032
20.0%
o 911032
20.0%
A 455516
10.0%
t 455516
10.0%
h 455516
10.0%
p 455516
10.0%
d 455516
10.0%
a 455516
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4099644
90.0%
Uppercase Letter 455516
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 911032
22.2%
o 911032
22.2%
t 455516
11.1%
h 455516
11.1%
p 455516
11.1%
d 455516
11.1%
a 455516
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 455516
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4555160
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 911032
20.0%
o 911032
20.0%
A 455516
10.0%
t 455516
10.0%
h 455516
10.0%
p 455516
10.0%
d 455516
10.0%
a 455516
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4555160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 911032
20.0%
o 911032
20.0%
A 455516
10.0%
t 455516
10.0%
h 455516
10.0%
p 455516
10.0%
d 455516
10.0%
a 455516
10.0%

class
Text

Missing 

Distinct: 6
Distinct (%): < 0.1%
Missing: 23379
Missing (%): 4.9%
Memory size: 3.7 MiB

Length

Max length: 12
Median length: 7
Mean length: 7.185830874
Min length: 7

Characters and Unicode

Total characters: 3271910
Distinct characters: 20
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Insecta
2nd row: Insecta
3rd row: Insecta
4th row: Insecta
5th row: Insecta

Value                   Count      Frequency (%)
insecta                 413026     90.7%
arachnida               21862      4.8%
diplopoda               16614      3.6%
chilopoda               3822       0.8%
malacostraca            3          < 0.1%
entognatha              1          < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 477200
14.6%
c 434894
13.3%
n 434890
13.3%
t 413031
12.6%
s 413029
12.6%
I 413026
12.6%
e 413026
12.6%
d 42298
 
1.3%
i 42298
 
1.3%
o 40876
 
1.2%
Other values (10) 147342
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2816582
86.1%
Uppercase Letter 455328
 
13.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 477200
16.9%
c 434894
15.4%
n 434890
15.4%
t 413031
14.7%
s 413029
14.7%
e 413026
14.7%
d 42298
 
1.5%
i 42298
 
1.5%
o 40876
 
1.5%
p 37050
 
1.3%
Other values (4) 67990
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
I 413026
90.7%
A 21862
 
4.8%
D 16614
 
3.6%
C 3822
 
0.8%
M 3
 
< 0.1%
E 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3271910
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 477200
14.6%
c 434894
13.3%
n 434890
13.3%
t 413031
12.6%
s 413029
12.6%
I 413026
12.6%
e 413026
12.6%
d 42298
 
1.3%
i 42298
 
1.3%
o 40876
 
1.2%
Other values (10) 147342
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3271910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 477200
14.6%
c 434894
13.3%
n 434890
13.3%
t 413031
12.6%
s 413029
12.6%
I 413026
12.6%
e 413026
12.6%
d 42298
 
1.3%
i 42298
 
1.3%
o 40876
 
1.2%
Other values (10) 147342
 
4.5%

order
Text

Missing 

Distinct: 57
Distinct (%): < 0.1%
Missing: 25166
Missing (%): 5.3%
Memory size: 3.7 MiB

Length

Max length: 19
Median length: 10
Mean length: 9.986916288
Min length: 6

Characters and Unicode

Total characters: 4529476
Distinct characters: 39
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Coleoptera
2nd row: Coleoptera
3rd row: Hemiptera
4th row: Coleoptera
5th row: Coleoptera

Value                   Count      Frequency (%)
coleoptera              248732     54.8%
hymenoptera             78942      17.4%
diptera                 31203      6.9%
lepidoptera             26313      5.8%
araneae                 18886      4.2%
siphonaptera            7243       1.6%
hemiptera               6031       1.3%
polydesmida             4683       1.0%
odonata                 3282       0.7%
phthiraptera            2634       0.6%
Other values (45)       25592      5.6%

Most occurring characters

ValueCountFrequency (%)
e 827386
18.3%
o 653542
14.4%
a 488769
10.8%
p 458095
10.1%
r 449540
9.9%
t 424265
9.4%
l 265647
 
5.9%
C 250358
 
5.5%
n 112734
 
2.5%
i 105008
 
2.3%
Other values (29) 494132
10.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4075938
90.0%
Uppercase Letter 453538
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 827386
20.3%
o 653542
16.0%
a 488769
12.0%
p 458095
11.2%
r 449540
11.0%
t 424265
10.4%
l 265647
 
6.5%
n 112734
 
2.8%
i 105008
 
2.6%
m 98755
 
2.4%
Other values (11) 192197
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
C 250358
55.2%
H 84973
 
18.7%
D 32260
 
7.1%
L 27956
 
6.2%
A 18911
 
4.2%
S 15169
 
3.3%
P 10299
 
2.3%
O 3495
 
0.8%
J 2336
 
0.5%
T 2253
 
0.5%
Other values (8) 5528
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 4529476
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 827386
18.3%
o 653542
14.4%
a 488769
10.8%
p 458095
10.1%
r 449540
9.9%
t 424265
9.4%
l 265647
 
5.9%
C 250358
 
5.5%
n 112734
 
2.5%
i 105008
 
2.3%
Other values (29) 494132
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4529476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 827386
18.3%
o 653542
14.4%
a 488769
10.8%
p 458095
10.1%
r 449540
9.9%
t 424265
9.4%
l 265647
 
5.9%
C 250358
 
5.5%
n 112734
 
2.5%
i 105008
 
2.3%
Other values (29) 494132
10.9%

family
Text

Missing 

Distinct: 652
Distinct (%): 0.2%
Missing: 56649
Missing (%): 11.8%
Memory size: 3.7 MiB

Length

Max length: 23
Median length: 19
Mean length: 10.64798914
Min length: 6

Characters and Unicode

Total characters: 4494069
Distinct characters: 52
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 85
Unique (%): < 0.1%

Sample

1st row: Dytiscidae
2nd row: Ceratocombidae
3rd row: Hydrophilidae
4th row: Tenebrionidae
5th row: Histeridae

Value                   Count      Frequency (%)
formicidae              43933      10.4%
histeridae              40587      9.6%
dytiscidae              30525      7.2%
tenebrionidae           29192      6.9%
hydrophilidae           24507      5.8%
streblidae              24125      5.7%
elmidae                 16997      4.0%
scarabaeidae            16232      3.8%
silphidae               13815      3.3%
coccinellidae           13613      3.2%
Other values (642)      168533     39.9%

Most occurring characters

ValueCountFrequency (%)
i 701336
15.6%
e 650877
14.5%
a 533483
11.9%
d 456223
10.2%
r 247956
 
5.5%
o 194190
 
4.3%
l 180053
 
4.0%
t 173285
 
3.9%
c 158366
 
3.5%
n 133456
 
3.0%
Other values (42) 1064844
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4071999
90.6%
Uppercase Letter 422057
 
9.4%
Other Punctuation 10
 
< 0.1%
Space Separator 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 701336
17.2%
e 650877
16.0%
a 533483
13.1%
d 456223
11.2%
r 247956
 
6.1%
o 194190
 
4.8%
l 180053
 
4.4%
t 173285
 
4.3%
c 158366
 
3.9%
n 133456
 
3.3%
Other values (15) 642774
15.8%
Uppercase Letter
ValueCountFrequency (%)
H 76527
18.1%
S 74066
17.5%
F 44397
10.5%
D 39964
9.5%
T 36051
8.5%
C 30834
7.3%
N 19584
 
4.6%
E 18052
 
4.3%
P 17374
 
4.1%
A 15812
 
3.7%
Other values (13) 49396
11.7%
Other Punctuation
ValueCountFrequency (%)
? 10
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4494056
> 99.9%
Common 13
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 701336
15.6%
e 650877
14.5%
a 533483
11.9%
d 456223
10.2%
r 247956
 
5.5%
o 194190
 
4.3%
l 180053
 
4.0%
t 173285
 
3.9%
c 158366
 
3.5%
n 133456
 
3.0%
Other values (38) 1064831
23.7%
Common
ValueCountFrequency (%)
? 10
76.9%
1
 
7.7%
( 1
 
7.7%
) 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4494069
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 701336
15.6%
e 650877
14.5%
a 533483
11.9%
d 456223
10.2%
r 247956
 
5.5%
o 194190
 
4.3%
l 180053
 
4.0%
t 173285
 
3.9%
c 158366
 
3.5%
n 133456
 
3.0%
Other values (42) 1064844
23.7%

genus
Text

Missing 

Distinct: 6686
Distinct (%): 1.8%
Missing: 103664
Missing (%): 21.7%
Memory size: 3.7 MiB
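
The percentages reported for genus appear to use two different denominators: Missing (%) is taken over all 478707 observations, while Distinct (%), Unique (%) and the mean length are taken over the non-missing values only. That reading is inferred from the reported numbers and is not stated by the report itself. A minimal Python sketch that reproduces the genus figures under that assumption (the variable names are illustrative):

```python
# Figures copied from the report; the choice of denominators is an assumption.
n_observations = 478707    # "Number of observations" in the overview
n_missing = 103664         # genus: Missing
n_distinct = 6686          # genus: Distinct
n_unique = 1985            # genus: Unique
total_characters = 3406771 # genus: Total characters

# Values that are actually present in the column.
n_present = n_observations - n_missing

missing_pct = 100 * n_missing / n_observations  # share of all rows
distinct_pct = 100 * n_distinct / n_present     # share of non-missing rows
unique_pct = 100 * n_unique / n_present         # share of non-missing rows
mean_length = total_characters / n_present      # characters per present value

print(f"Missing (%):  {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
print(f"Mean length:  {mean_length:.9f}")
```

The same arithmetic can be applied to the other variables as a quick consistency check on their summaries.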

Length

Max length: 24
Median length: 19
Mean length: 9.083681071
Min length: 2

Characters and Unicode

Total characters: 3406771
Distinct characters: 56
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
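
As a rough illustration of how such character properties can be tallied, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` and the small `CATEGORY_NAMES` lookup are hypothetical, and the input is just the five genus sample rows shown in this report.

```python
import unicodedata
from collections import Counter

# Readable names for a few general-category codes (illustrative subset only).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(values):
    """Count characters by Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

sample = ["Neoporus", "Tropisternus", "Paratenetus", "Carcinops", "Teretriosoma"]
counts = category_counts(sample)
total = sum(counts.values())

for code, count in counts.most_common():
    name = CATEGORY_NAMES.get(code, code)
    print(f"{name} {count} {100 * count / total:.1f}%")
```

Each sample value is a capitalised single word, so only the uppercase and lowercase letter categories occur here; punctuation and space categories show up only once values such as parenthesised names are included.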

Unique

Unique: 1985
Unique (%): 0.5%

Sample

1st row: Neoporus
2nd row: Tropisternus
3rd row: Paratenetus
4th row: Carcinops
5th row: Teretriosoma

Value                Count    Frequency (%)
trichobius           9061     2.4%
bombus               8985     2.4%
eleodes              7388     2.0%
agrilus              6940     1.9%
nicrophorus          6760     1.8%
ataenius             5428     1.4%
stenelmis            5388     1.4%
catocala             4652     1.2%
formica              4345     1.2%
cercyon              3886     1.0%
Other values (6674)  312222   83.2%
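
The entries in this table are lowercase even though the sample rows are capitalised, and the counts sum to 375055, which is the 375043 non-missing values plus the 12 space separators reported for this variable. Both observations suggest the table counts lowercased, whitespace-separated words rather than whole cell values. A minimal Python sketch under that assumption follows; the function name `word_frequencies` is hypothetical and the input list is made-up illustration data reusing genus names from the sample rows.

```python
from collections import Counter

def word_frequencies(values):
    """Count lowercased, whitespace-separated words across non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue  # missing cells contribute no words
        counts.update(value.lower().split())
    return counts

# Illustrative input only: one missing cell and one repeated genus.
genus_sample = ["Neoporus", "Tropisternus", None, "Carcinops", "Tropisternus"]

freqs = word_frequencies(genus_sample)
total = sum(freqs.values())

for word, count in freqs.most_common():
    print(f"{word} {count} {100 * count / total:.1f}%")
```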

Most occurring characters

ValueCountFrequency (%)
s 330101
 
9.7%
o 305120
 
9.0%
a 263237
 
7.7%
i 255852
 
7.5%
e 248464
 
7.3%
r 219267
 
6.4%
u 216158
 
6.3%
t 171537
 
5.0%
l 166617
 
4.9%
p 139508
 
4.1%
Other values (46) 1090910
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3031661
89.0%
Uppercase Letter 375042
 
11.0%
Other Punctuation 34
 
< 0.1%
Space Separator 12
 
< 0.1%
Open Punctuation 11
 
< 0.1%
Close Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 330101
10.9%
o 305120
10.1%
a 263237
 
8.7%
i 255852
 
8.4%
e 248464
 
8.2%
r 219267
 
7.2%
u 216158
 
7.1%
t 171537
 
5.7%
l 166617
 
5.5%
p 139508
 
4.6%
Other values (16) 715800
23.6%
Uppercase Letter
ValueCountFrequency (%)
A 46880
12.5%
C 41889
11.2%
P 41806
11.1%
H 31176
8.3%
E 30415
8.1%
S 29718
7.9%
T 24280
 
6.5%
B 21904
 
5.8%
M 19527
 
5.2%
N 18642
 
5.0%
Other values (16) 68805
18.3%
Other Punctuation
ValueCountFrequency (%)
. 34
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3406703
> 99.9%
Common 68
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 330101
 
9.7%
o 305120
 
9.0%
a 263237
 
7.7%
i 255852
 
7.5%
e 248464
 
7.3%
r 219267
 
6.4%
u 216158
 
6.3%
t 171537
 
5.0%
l 166617
 
4.9%
p 139508
 
4.1%
Other values (42) 1090842
32.0%
Common
ValueCountFrequency (%)
. 34
50.0%
12
 
17.6%
( 11
 
16.2%
) 11
 
16.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3406771
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 330101
 
9.7%
o 305120
 
9.0%
a 263237
 
7.7%
i 255852
 
7.5%
e 248464
 
7.3%
r 219267
 
6.4%
u 216158
 
6.3%
t 171537
 
5.0%
l 166617
 
4.9%
p 139508
 
4.1%
Other values (46) 1090910
32.0%

subgenus
Text

Missing 

Distinct: 637
Distinct (%): 0.9%
Missing: 411166
Missing (%): 85.9%
Memory size: 3.7 MiB

Length

Max length: 19
Median length: 16
Mean length: 9.802445922
Min length: 4

Characters and Unicode

Total characters: 662067
Distinct characters: 50
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 153
Unique (%): 0.2%

Sample

1st row: Tropisternus
2nd row: Carcinops
3rd row: Corticeus
4th row: Hydroxenus
5th row: Acritus

Value               Count   Frequency (%)
saprinus            3197    4.7%
blapylis            2577    3.8%
aeletes             2493    3.7%
hesperosaprinus     2388    3.5%
xerosaprinus        2243    3.3%
cercyon             2241    3.3%
methydrus           1851    2.7%
rhopalohelophorus   1849    2.7%
tropisternus        1806    2.7%
ptomister           1549    2.3%
Other values (627)  45347   67.1%

Most occurring characters

ValueCountFrequency (%)
s 74362
 
11.2%
o 59942
 
9.1%
e 59268
 
9.0%
r 52083
 
7.9%
a 46153
 
7.0%
u 42116
 
6.4%
i 38589
 
5.8%
l 37335
 
5.6%
p 31769
 
4.8%
n 30609
 
4.6%
Other values (40) 189841
28.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 594523
89.8%
Uppercase Letter 67544
 
10.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 74362
12.5%
o 59942
10.1%
e 59268
10.0%
r 52083
8.8%
a 46153
 
7.8%
u 42116
 
7.1%
i 38589
 
6.5%
l 37335
 
6.3%
p 31769
 
5.3%
n 30609
 
5.1%
Other values (15) 122297
20.6%
Uppercase Letter
ValueCountFrequency (%)
P 9740
14.4%
C 8753
13.0%
B 6333
9.4%
A 6281
9.3%
H 5732
8.5%
S 5374
8.0%
M 4287
 
6.3%
T 4188
 
6.2%
L 2721
 
4.0%
X 2648
 
3.9%
Other values (15) 11487
17.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 662067
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 74362
 
11.2%
o 59942
 
9.1%
e 59268
 
9.0%
r 52083
 
7.9%
a 46153
 
7.0%
u 42116
 
6.4%
i 38589
 
5.8%
l 37335
 
5.6%
p 31769
 
4.8%
n 30609
 
4.6%
Other values (40) 189841
28.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 662067
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 74362
 
11.2%
o 59942
 
9.1%
e 59268
 
9.0%
r 52083
 
7.9%
a 46153
 
7.0%
u 42116
 
6.4%
i 38589
 
5.8%
l 37335
 
5.6%
p 31769
 
4.8%
n 30609
 
4.6%
Other values (40) 189841
28.7%

specificEpithet
Text

Missing 

Distinct: 17496
Distinct (%): 5.6%
Missing: 167744
Missing (%): 35.0%
Memory size: 3.7 MiB

Length

Max length: 31
Median length: 25
Mean length: 8.747863894
Min length: 2

Characters and Unicode

Total characters: 2720262
Distinct characters: 52
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7570
Unique (%): 2.4%

Sample

1st row: clypealis
2nd row: lateralis
3rd row: chalybaeum
4th row: platensis
5th row: scarabaeoides

Value                 Count    Frequency (%)
joblingi              2649     0.8%
crenata               2513     0.8%
confluens             2040     0.7%
lugens                2033     0.7%
assimilis             1916     0.6%
pensylvanicus         1689     0.5%
alternatus            1356     0.4%
aranea                1288     0.4%
convergens            1265     0.4%
abbreviatus           1252     0.4%
Other values (17471)  293690   94.2%

Most occurring characters

ValueCountFrequency (%)
i 312370
11.5%
a 305309
11.2%
s 284842
10.5%
u 212917
 
7.8%
e 192498
 
7.1%
t 173456
 
6.4%
l 171679
 
6.3%
n 170480
 
6.3%
r 168692
 
6.2%
c 137491
 
5.1%
Other values (42) 590528
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2717231
99.9%
Decimal Number 1044
 
< 0.1%
Other Punctuation 948
 
< 0.1%
Space Separator 728
 
< 0.1%
Dash Punctuation 285
 
< 0.1%
Close Punctuation 11
 
< 0.1%
Open Punctuation 11
 
< 0.1%
Connector Punctuation 3
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 312370
11.5%
a 305309
11.2%
s 284842
10.5%
u 212917
 
7.8%
e 192498
 
7.1%
t 173456
 
6.4%
l 171679
 
6.3%
n 170480
 
6.3%
r 168692
 
6.2%
c 137491
 
5.1%
Other values (18) 587497
21.6%
Decimal Number
ValueCountFrequency (%)
1 414
39.7%
0 327
31.3%
2 105
 
10.1%
9 73
 
7.0%
5 57
 
5.5%
4 43
 
4.1%
8 14
 
1.3%
3 6
 
0.6%
6 4
 
0.4%
7 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 747
78.8%
# 104
 
11.0%
* 53
 
5.6%
? 23
 
2.4%
' 13
 
1.4%
/ 8
 
0.8%
Close Punctuation
ValueCountFrequency (%)
] 8
72.7%
) 3
 
27.3%
Open Punctuation
ValueCountFrequency (%)
[ 8
72.7%
( 3
 
27.3%
Space Separator
ValueCountFrequency (%)
728
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 285
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2717231
99.9%
Common 3031
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 312370
11.5%
a 305309
11.2%
s 284842
10.5%
u 212917
 
7.8%
e 192498
 
7.1%
t 173456
 
6.4%
l 171679
 
6.3%
n 170480
 
6.3%
r 168692
 
6.2%
c 137491
 
5.1%
Other values (18) 587497
21.6%
Common
ValueCountFrequency (%)
. 747
24.6%
728
24.0%
1 414
13.7%
0 327
10.8%
- 285
 
9.4%
2 105
 
3.5%
# 104
 
3.4%
9 73
 
2.4%
5 57
 
1.9%
* 53
 
1.7%
Other values (14) 138
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2720259
> 99.9%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 312370
11.5%
a 305309
11.2%
s 284842
10.5%
u 212917
 
7.8%
e 192498
 
7.1%
t 173456
 
6.4%
l 171679
 
6.3%
n 170480
 
6.3%
r 168692
 
6.2%
c 137491
 
5.1%
Other values (40) 590525
21.7%
None
ValueCountFrequency (%)
é 2
66.7%
ö 1
33.3%

infraspecificEpithet
Text

Missing 

Distinct: 5
Distinct (%): 0.2%
Missing: 476240
Missing (%): 99.5%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 7
Mean length: 7.127685448
Min length: 4

Characters and Unicode

Total characters: 17584
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Variety
2nd row: variety
3rd row: Variety
4th row: Variety
5th row: Variety

Value       Count   Frequency (%)
variety     2260    91.6%
aberration  156     6.3%
form        50      2.0%
race        1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15199
86.4%
Uppercase Letter 2385
 
13.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2622
17.3%
a 2417
15.9%
e 2417
15.9%
i 2416
15.9%
t 2416
15.9%
y 2260
14.9%
o 206
 
1.4%
b 156
 
1.0%
n 156
 
1.0%
v 82
 
0.5%
Other values (2) 51
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
V 2178
91.3%
A 156
 
6.5%
F 50
 
2.1%
R 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 17584
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%

taxonRank
Text

Missing 

Distinct: 5
Distinct (%): 0.2%
Missing: 476240
Missing (%): 99.5%
Memory size: 3.7 MiB

Length

Max length: 10
Median length: 7
Mean length: 7.127685448
Min length: 4

Characters and Unicode

Total characters: 17584
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Variety
2nd row: variety
3rd row: Variety
4th row: Variety
5th row: Variety

Value       Count   Frequency (%)
variety     2260    91.6%
aberration  156     6.3%
form        50      2.0%
race        1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15199
86.4%
Uppercase Letter 2385
 
13.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2622
17.3%
a 2417
15.9%
e 2417
15.9%
i 2416
15.9%
t 2416
15.9%
y 2260
14.9%
o 206
 
1.4%
b 156
 
1.0%
n 156
 
1.0%
v 82
 
0.5%
Other values (2) 51
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
V 2178
91.3%
A 156
 
6.5%
F 50
 
2.1%
R 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 17584
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2622
14.9%
a 2417
13.7%
e 2417
13.7%
i 2416
13.7%
t 2416
13.7%
y 2260
12.9%
V 2178
12.4%
o 206
 
1.2%
A 156
 
0.9%
b 156
 
0.9%
Other values (6) 340
 
1.9%
Distinct: 4
Distinct (%): 66.7%
Missing: 478701
Missing (%): > 99.9%
Memory size: 3.7 MiB

Length

Max length: 9
Median length: 8.5
Mean length: 6
Min length: 4

Characters and Unicode

Total characters: 36
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 50.0%

Sample

1st row: Boisduval
2nd row: Rühl
3rd row: Rühl
4th row: Rühl
5th row: Strecker

Value      Count   Frequency (%)
rühl       3       50.0%
boisduval  1       16.7%
strecker   1       16.7%
reakirt    1       16.7%

Most occurring characters

ValueCountFrequency (%)
R 4
11.1%
l 4
11.1%
ü 3
 
8.3%
h 3
 
8.3%
e 3
 
8.3%
r 3
 
8.3%
t 2
 
5.6%
a 2
 
5.6%
k 2
 
5.6%
i 2
 
5.6%
Other values (8) 8
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30
83.3%
Uppercase Letter 6
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 4
13.3%
ü 3
10.0%
h 3
10.0%
e 3
10.0%
r 3
10.0%
t 2
 
6.7%
a 2
 
6.7%
k 2
 
6.7%
i 2
 
6.7%
d 1
 
3.3%
Other values (5) 5
16.7%
Uppercase Letter
ValueCountFrequency (%)
R 4
66.7%
S 1
 
16.7%
B 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 36
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 4
11.1%
l 4
11.1%
ü 3
 
8.3%
h 3
 
8.3%
e 3
 
8.3%
r 3
 
8.3%
t 2
 
5.6%
a 2
 
5.6%
k 2
 
5.6%
i 2
 
5.6%
Other values (8) 8
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
91.7%
None 3
 
8.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 4
12.1%
l 4
12.1%
h 3
9.1%
e 3
9.1%
r 3
9.1%
t 2
 
6.1%
a 2
 
6.1%
k 2
 
6.1%
i 2
 
6.1%
d 1
 
3.0%
Other values (7) 7
21.2%
None
ValueCountFrequency (%)
ü 3
100.0%

nomenclaturalCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 3.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1914824
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ICZN
2nd row: ICZN
3rd row: ICZN
4th row: ICZN
5th row: ICZN

Value  Count    Frequency (%)
iczn   478706   100.0%

Most occurring characters

ValueCountFrequency (%)
I 478706
25.0%
C 478706
25.0%
Z 478706
25.0%
N 478706
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1914824
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 478706
25.0%
C 478706
25.0%
Z 478706
25.0%
N 478706
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1914824
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 478706
25.0%
C 478706
25.0%
Z 478706
25.0%
N 478706
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1914824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 478706
25.0%
C 478706
25.0%
Z 478706
25.0%
N 478706
25.0%